Just Looking at Food...

PAGE 829 

Feeding and hunger states are regulated by the AgRP and the POMC neurons. 
By optically recording their activity in mice, Chen et al. make the unexpected 
observation that simply presenting food to a hungry mouse resets the neuronal 
activation state from one associated with hunger to one associated with satiety, 
even if no food is consumed. This sensory regulation suggests that the role of this
circuit is not restricted to responding to internal energy states.



A Landscape for Longevity 

PAGE 842 

Energy homeostasis is coordinated both locally in peripheral tissues and distally 
by the nervous system. Examining the relative contributions of these local and 
distal effects to healthy aging, Burkewitz et al. show that AMPK locally and 
cell-autonomously increases longevity; however, to slow aging, AMPK must 
inactivate CRTC-dependent transcription in neurons to generate a systemic catecholamine signal that creates a permissive 
transcriptional and mitochondrial landscape for longevity. 

Speed Dating for DNA

PAGE 856 

DNA recombination requires matching up homologous sequences, but doing this efficiently is a challenge given the vast 
amount of DNA in the genome to be sampled. Using single-molecule imaging with ssDNA curtains, Qi et al. find that DNA 
recombinases kinetically discriminate between different lengths of microhomology, using eight-nucleotide tracts to interrogate
and align homologous DNA sequences.



Framing Ribosome Excursions 

PAGE 870 

Frameshift-programming mRNAs regulate the translation of alternative protein products from a single transcript. Van et al. 
find that E. coli ribosomes shift reading frames via multiple translocation attempts induced by flanking mRNA structural bar- 
riers. These dynamic excursions permit ribosomes to access alternative codon:anticodon base-pairing along the mRNA, 
thereby unlocking a broad range of frameshift pathways. 



Protein Bet Hedging 

PAGE 882 

Stiffler et al. explore the origins of evolvability by systematically analyzing all sin- 
gle amino acid mutants in an enzyme under selection for a wild-type function 
(ampicillin resistance) and a new function (cefotaxime resistance). The results 
suggest that fluctuating environments select for enzymes with excess activity 
relative to the strength of selection. 



Viral Lessons from the Survivors 

PAGE 893 and PAGE 904 

Flyak et al. characterize a panel of neutralizing antibodies from a human survivor 
of Marburg virus hemorrhagic fever and report that all antibodies bind to a single 
major antigenic site in the viral glycoprotein (GP). Hashiguchi et al. determine 
the crystal structure of the viral GP in complex with one of these antibodies 
and show that the binding site overlaps with the GP binding site to its cellular 
receptor. Remarkably, the same antibodies can also bind to Ebola GP, despite 
the differences in protein sequence between the glycoproteins of the two viruses,
providing a critical template for development of immunotherapeutics and inhib- 
itors of viral entry. 




[Figure labels: Human Fabs; Monoclonal antibodies; Marburg glycoprotein]



Cell 160, February 26, 2015 ©2015 Elsevier Inc. 799





Cell 



Totally Tubular Protein Delivery Vehicle 

PAGE 952 and PAGE 940 

Bacterial Type VI secretion systems (T6SS) deliver proteins into target cells by a rapid 
contraction of a long sheath assembly. Two papers in this issue unveil the structural organization
of tube-like contractile sheaths, determined by atomic-resolution cryoelectron microscopy.
Kudryashev et al. report the structure of the Vibrio cholerae sheath, while Clemens et al.
identify and structurally characterize a new T6SS from Francisella novicida. The structures
provide insight into how bacterial sheaths can be recycled for multiple rounds of protein
delivery and define a repeating two-protein substructure that forms the basis for the
mesh-like architecture of the sheaths. 



Building Bridges Breaks Chromosomes 

PAGE 913 

Marzec et al. identify a mechanism of telomere-induced genome instability that contributes to developing complex rearrange- 
ments in ALT sarcomas. Accumulation of GGGTCA variant repeats on ALT telomeres leads to the aberrant recruitment of 
NR2C/F nuclear receptors, which, in turn, can bridge to their conventional binding sites through the nuclear space. Homol- 
ogous recombination can then lead to insertion of telomeric sequences proximal to NR2C/F binding sites, creating potential 
fragile sites. 

Telomere Inactivation in Aging 

PAGE 928 

Telomerase is required for telomere maintenance and protection. Using single-cell analyses in yeast, Xie et al. identify a new 
layer of lifespan regulation where they demonstrate that early telomerase inactivation leads to accelerated mother cell aging. 
This event is distinct from senescence caused by telomere shortening and is associated with a transient DNA damage response.

TGF-β Fuels Resistance

PAGE 963 

Understanding the cause of therapeutic resistance is crucial for improving the efficacy of cancer therapy. Oshimori et al. show 
that perivascular TGF-β suppresses proliferation but promotes invasion and heterogeneity in squamous cell carcinoma stem
cells. These cells reprogram anti-oxidant metabolism and resist anti-cancer therapy, leading to tumor recurrence.

Approaching Cancer with Precision 

PAGE 977 

There is a lack of effective predictive biomarkers to precisely assign optimal therapy to cancer patients. Montero et al. find that 
drug-induced death signaling measured by Dynamic BH3 Profiling predicts chemotherapy response across many cancer 
types treated with different drugs, including combinations of chemotherapies, revealing its potential use as a predictive
biomarker for cancer therapy. 

Wiring HIV Latency

PAGE 990 and PAGE 1002 

Establishment of HIV latency is viewed currently as a process that depends on cellular environment and activation state. New 
work from Razooky et al. presents results from a synthetic circuit approach suggesting that the latency program can be modu- 
lated by Tat expression independent of the cell state, supporting a model in which a hardwired latency circuit can be initiated 
autonomously. Rouzine et al. build from these findings to focus on the potential benefit of such a circuit for evolution of 
latency. They develop a mathematical model, consistent with observed patterns during
infection, in which initial establishment of latency promotes infectivity at mucosal barriers,
suggesting new approaches for therapies focused on eliminating latent virus.



A Killi App for Aging Research 

PAGE 1013 

Aging is the number one risk factor for many human pathologies, yet it is challenging to 
study as existing vertebrate models are relatively long lived. Harel et al. have developed 
an integrative genome-to-phenotype platform in a naturally short-lived vertebrate, the 
African turquoise killifish, opening the door to high-throughput in vivo modeling of verte- 
brate aging and complex human diseases. 

This year’s winter in Cambridge, Massachusetts (the home
of the Cell Press office) has been a humbling experience.
It has laid bare how poorly suited we as a species are to 
cold weather and brings to mind not only ways of keeping 
warm but also the adaptations of species better suited to the
cold that we would emulate if we could. 

Like other mammals, we have the ability to produce heat 
through non-shivering thermogenesis. The molecular under- 
pinnings of this are well known — briefly, upregulation of un- 
coupling protein 1 (UCP1) in brown adipose tissue helps 
mitochondria dissipate energy as heat. Although it is known 
that UCP1 expression is sensitive to cold exposure, the 
cellular mechanisms of this are unclear. Recently, the lab of 
Hei Sook Sul has made a significant step forward in identi- 
fying a protein that is critical to UCP1 expression and which 
is induced by cold exposure (Dempersmier et al., 2015). The 
protein, Zfp516, binds to the UCP1 promoter and drives its 
transcription in brown fat, and it further stimulates the brown- 
ing of subcutaneous white adipose tissue (beige fat). Beyond 
its role in keeping us warm, these features make Zfp516 a 
promising target for anti-obesity treatments. 

Although UCP1 is particularly effective at heat generation, 
countless reactions in a cell release energy as heat. New find- 
ings by Riedel et al. (2015) suggest that the heat released by 
catalysis, such as the highly exothermic breakdown of 
hydrogen peroxide to water and oxygen by the enzyme cata- 
lase, has a direct effect on enzyme diffusion. By examining 
single-molecule behavior of catalase and other enzymes, 
they show that the heat generated by catalysis increases 
enzyme diffusion by producing an asymmetric pressure 
wave that applies a force that is centered away from the pro- 
tein’s center of mass. To imagine a single catalytic event, the 
force of a recoiling gun might be one analogy, and though an 
enzyme’s cumulative motion would still be Brownian, I nevertheless
envision its movement as that of a small boat weaving
aimlessly through the warm turquoise waters of a Caribbean 
mangrove. But I digress. The authors suggest that the heat- 
driven pressure waves may lead to a partial unfolding of 
part of an enzyme that could temporarily halt its activity, or 
intriguingly, this force might affect the processivity of molec- 
ular complexes such as RNA or DNA polymerase. 

For many species, these everyday cellular reactions do not 
generate enough warmth to counteract the brutal cold of the 
environments in which they are found. So if homeothermy
(or hopping a plane to Florida) isn’t an option, what’s a species
to do? One answer is to embrace the cold. So-called
anti-freeze proteins in species ranging from bacteria to verte- 
brates hinder the formation of macroscopic ice by binding 
the surface of nascent ice crystals, halting their growth. A 
recent study of anti-freeze protein type III (AFP-III) from the
Antarctic eelpout (a fish that looks a bit like an eel) shows 
that the protein at temperatures higher than freezing has 
“ice-like” water layers around the ice binding site of the pro- 
tein (Meister et al., 2014). This suggests that it is these or- 
dered water molecules and not the protein itself that binds 
to nascent ice crystals. In other words, polar fish have a pro- 
tein that “makes” ice for ice fishing. Once you’ve wrapped 
your head around that, consider this: the anti-freeze proteins
of Antarctic icefish not only inhibit formation of ice crystals
at subzero temperatures, they also impede their melting
at higher temperatures (Cziko et al., 2014). Meaning, for species
that rely on anti-freeze proteins, microscopic ice crystals
may persist within them even during periods of above- 
freezing temperatures. Looking outside at the bleak gray- 
white cityscape, I can’t help but think we could use some 
of that ice flowing in our veins until spring comes. 

REFERENCES 

Cziko, P.A., DeVries, A.L., Evans, C.W., and Cheng, C.H. (2014). Proc. Natl.
Acad. Sci. USA 111, 14583-14588. 

Dempersmier, J., Sambeat, A., Gulyaeva, O., Paul, S.M., Hudak, C.S.S., 
Raposo, H.F., Kwan, H.-Y., Kang, C., Wong, R.H.F., and Sul, H.S. (2015). 
Mol. Cell 57, 235-246. 

Meister, K., Strazdaite, S., DeVries, A.L., Lotze, S., Olijve, L.L.C., Voets, I.K.,
and Bakker, H.J. (2014). Proc. Natl. Acad. Sci. USA 111, 17732-17736.

Riedel, C., Gabizon, R., Wilson, C.A.M., Hamadani, K., Tsekouras, K., 
Marqusee, S., Pressé, S., and Bustamante, C. (2015). Nature 517, 227-230.

Robert Kruger 







Leading Edge 

Previews 



The Hunger Games 

Randy J. Seeley1,* and Kent C. Berridge2

1Department of Surgery
2Department of Psychology
University of Michigan, Ann Arbor, MI 48109, USA
*Correspondence: seeleyrj@med.umich.edu
http://dx.doi.org/10.1016/j.cell.2015.02.028



Although AgRP and POMC neurons in the hypothalamus have long been associated with regulation 
of food intake, in this issue of Cell, Chen et al. use direct imaging in vivo to demonstrate rapid 
changes in their activity upon food presentation. The rapidity of their altered responses challenges 
classic notions of their functions and raises new hypotheses. 



Food is essential to an organism’s sur- 
vival, and consequently, considerable 
neural circuitry is dedicated to directing 
and regulating ingestive behaviors. Hypothalamic
AgRP and POMC neurons have been
known as the yin/yang of food intake 
regulation for over a decade (Schwartz 
et al., 2000). They are targets of molecules 
indicating energy status such as leptin, 
ghrelin, and nutrients, with AgRP neurons 
promoting feeding and POMC neurons 
decreasing feeding. However, ap- 
proaches to measuring the activity of 
these neurons have been technically 
limited in terms of monitoring them during 
the act of eating itself. 

In the current edition of Cell, all of this 
changes. Chen et al. (2015) used fiber 
photometry to visualize the activity of 
both AgRP and POMC neurons while hun- 
gry mice began to eat palatable food 
or interact with food odors in their envi- 
ronment. Given that activating AgRP neu- 
rons is thought to cause a robust and 
rapid increase in food intake (Aponte 
et al., 2011), a logical expectation might 
have been that AgRP neuronal activity 
would be high when animals began to 
eat, remain high during the early portion 
of the meal, and gradually decline during 
eating as appetite ebbed. The exact 
opposite pattern would be expected in 
POMC neurons: a low start followed by
a gradual rise during eating. What Chen 
et al. found, however, was that while 
AgRP neuronal activity was high in fasted 
mice before encountering food, their 
AgRP neuronal activity decreased in 
mere seconds as soon as food was 
presented and just as eating began. 
Conversely, POMC activity, while low as
expected in hungry mice, rose almost
immediately as soon as the mouse began
to eat, even though mice continued to eat
avidly for some time more without being
inhibited by the initial rise in POMC
neuronal activity. If the chow pellet was 
removed midway through the meal, the 
AgRP neurons increased again in activity, 
and the POMC neurons declined. Moreover,
if mice were given access to more
attractive food, such as chocolate or pea- 
nut butter, the rapid decrease in AgRP 
activity and increase in POMC activity
were even more pronounced. 

These observations have a number 
of important implications. The rapid 
changes in the activity of these neurons 
could not be the result of signals coming 
from the body about fuel status. That is, 
the early POMC rise could not be a
physiological satiety signal, nor could 
the early AgRP decline mean that appe- 
tite had disappeared (since the mice 
continued to eat avidly for some time after 
both signals changed). At least, if the 
initial POMC rise were a satiety signal
that stops eating, it was a remarkably inef- 
fective one because most of the avid 
eating occurred afterward. Rather, these 
changes must reflect inputs onto these 
neurons that process information about 
the immediate availability and attractive- 
ness of food in the environment. 

What does that mean for understanding 
the regulatory roles of AgRP or POMC
neurons? Chen et al. (2015) suggest one 
possibility. They note that hunger would 
promote foraging in addition to eating 
food actually found and propose that the 
role of AgRP neurons is specifically the 
former. A sudden drop in AgRP activity as
soon as food was discovered, they suggest,
“provides a mechanism to rapidly
inhibit foraging upon the discovery of
food.” In that case, AgRP and POMC
would have a role in appetitive food 
seeking and foraging behaviors but not 
so much in the consummatory eating 
phase of actual biting, chewing, and swal- 
lowing. Splitting appetite into separate 
effects on foraging and consummatory 
behaviors is certainly one way of poten- 
tially solving this puzzle. However, that 
split raises a further puzzle of why earlier 
studies reported that AgRP and POMC
manipulations do powerfully control food
consumption, so their influence is not limited to
foraging behavior (Aponte et al., 2011).

A second way of looking at the rapid 
changes in activity is that AgRP may still 
promote the act of eating and intake, 
and POMC activity inhibits intake, but
these signals are only the first links in a 
long chain. By that view, the rapid 
changes in AgRP and POMC neuronal
activity are not sufficient to inhibit intake 
on their own but might act as the first 
topple in a chain of dominos. After 
some delay, the final domino might be 
another mechanism that successfully 
inhibits eating. 

A third way of looking at the rapid 
response of AgRP and POMC neurons
is the alternative view that perhaps 
these signals do not drive eating directly, 
but rather these neurons modulate and 
receive powerful input from brain reward 
circuitry that reacts to cues and foods 
in the environment and that mediates 
current motivation to eat (Figure 1). That 
is, high AgRP (and low POMC) may
prime the reactivity of mesocorticolimbic 
circuitry to the sight, smell, and taste 
of food, which generates high incentive 
motivation to eat, rather than simply 
causing a hunger drive that more directly
powers eating. Once eating is triggered
by that amplified mesocorticolimbic
reaction to food, the high AgRP could
be superfluous to appetite and eating
behavior and is able to decline without
suppressing behavior. Higher mesocorticolimbic
reactivity could sustain eating
by its own continuing activation, such as
by higher dopamine levels or related
neuronal signals in nucleus accumbens
or related targets (see Figure 1). In turn,
by this view, mesocorticolimbic circuitry
must send feedback signals that food is
encountered to hypothalamus, causing
the early changes in AgRP and POMC
neurons, so that their activity immediately
reflects the incentive value of food in the
moment.

Figure 1. Integrated Model for the Roles for AgRP and POMC Neurons in Food Responsiveness and Energy Homeostasis

Traditional models of hypothalamic regulation of food intake (blue arrows) hypothesize that AgRP and
POMC neurons in the hypothalamus are regulated by signals of fuel availability and, in turn, that AgRP
activation directly drives eating, whereas POMC activation inhibits eating (hatched blue arrow). Chen et al.
(2015) challenge this view and show that these neurons’ activity is often disconnected from the act of
eating itself. Incorporating findings of Chen et al. (2015) into the incentive interpretation we describe, the
activity of these neurons instead primes the motivational/incentive salience mesocorticolimbic circuitry to
react to food stimuli, which sustains continued eating, and feeds back to immediately and potently
regulate hypothalamic neuronal activity (yellow arrows). This embeds hypothalamic function to regulate
eating into larger circuitry that also incorporates mesocorticolimbic pathways and regulates the varied
behaviors involved in acquiring and consuming food in a complex environment.

Other data support this modulatory
incentive hypothesis. For example, starvation
signals similarly increase mesocorticolimbic
reactivity to food in both
humans and rats (Berthoud, 2012;
DiLeone, 2009; Farooqi et al., 2007; Figlewicz
and Sipols, 2010) (though compare
Fulton et al. [2006]). Incentive-related
feedback from mesocorticolimbic circuitry
may also explain another finding
of Chen et al. (2015), namely that the
rapid AgRP and POMC activity changes
triggered by mouse chow can be blocked
if the mouse has just eaten a morsel of
chocolate or peanut butter 10 min earlier.
If the order is reversed, however, eating
chow first does not block the neural
responses to a subsequent chocolate
or peanut butter treat. Eating chocolate
first would reduce the incentive value
of chow, but that should not occur
in reverse, and so the rapid AgRP
and POMC changes accordingly remain
robust to both foods.

This incentive hypothesis of hypothalamic
interaction with mesocorticolimbic
circuitry leads to some further predictions.
For example, neutral cues in the
environment can gain motivational value
when paired with food and activate
mesocorticolimbic systems as effectively
as food itself. The current findings would
predict that such previously neutral stimuli
would also serve as potent stimuli
to rapidly alter the activity of AgRP and
POMC neurons if they have been learned
as food cues.

The bottom line is that psychologists
and neuroscientists have spent decades
investigating the relationship between
neural activity and key aspects of our
behavior, including motivation, reward,
and hunger. Chen et al. (2015) have ushered
in a new chapter where molecular
markers of activity for the neurons one
wishes to observe can be directly related
to ingestive behavior. Here, we have
learned that these specific neuronal
populations respond more rapidly than
previously suspected to information
about the quality of food in their environment.
Given the importance of these
neurons beyond ingestive homeostasis
(Dietrich et al., 2012; Matarese et al.,
2013), the implications of this work
extend to understanding not only how
food intake is regulated but also a wide
swath of topics around the relationship
between brain and behavior.

Aging is a risk factor for chronic diseases, and identifying targets for intervention is a goal of the
aging field. Burkewitz et al. now describe a mechanism that mediates the specific role for AMPK 
in longevity, whereby its activity in neurons modulates metabolism and mitochondrial integrity in 
peripheral tissues. 



Because aging is the primary risk factor 
for the development of many chronic dis- 
eases, it is a fundamental public health 
problem. Therefore, one goal of the aging 
field is to identify regulatory mechanisms 
that could become targets of intervention. 
Animals adjust their metabolic rates and 
life schedules according to nutrient sta- 
tus. The highly conserved AMP-activated 
protein kinase, AMPK, which is activated 
under low nutrient conditions and is 
required for lifespan extension with die- 
tary restriction (DR), is an attractive target 
for such interventions. However, AMPK 
also affects growth, reproduction, and 
disease development (Mair et al., 2011). 
Therefore, identifying mechanisms of 
AMPK activation that slow aging without 
deleterious effects is important in moving 
AMPK pathway drugs to a clinical appli- 
cation. Previously, Mair and colleagues 
showed that inhibition of the cyclic AMP- 
responsive element (CREB)-regulated 
transcriptional co-activator (CRTC-1) is 
required for AMPK-mediated lifespan 
extension (Mair et al., 2011). In this issue
of Cell, Burkewitz et al. (2015) now find 
that CRTC-1 specifically mediates 
AMPK’s role in longevity, but not growth 
or reproduction, through its activity in 
neurons, modulating metabolism and 
mitochondrial integrity in peripheral tis- 
sues. Notably, neuronal AMPK/CRTC-1 
status is dominant to the pathway’s 
activity in peripheral tissues, which has 
implications for the development of 
AMPK-based therapeutics (Figure 1).

To identify the mechanisms under- 
lying the specific effect of CRTC-1 on 
lifespan, the authors first zeroed in on 
transcriptional targets that correlated
solely with AMPK/CRTC-1-dependent
longevity. This set was enriched for mito- 
chondrial metabolism genes, and metab- 



olomic analyses demonstrated an in- 
crease in TCA cycle intermediates and 
associated metabolites upon AMPK acti- 
vation, suggesting a specific coupling of 
AMPK-mediated metabolic regulation 
and lifespan extension. Moreover, the authors
found that several of these metabolic
genes were also regulated by
NHR-49, a functional ortholog of the nuclear
receptor PPARα, which activates
transcription in low energy states, ultimately
acting in an antagonistic manner
to CRTC-1.

CRTC-1 is expressed in neurons and in- 
testine, a major site of longevity regulation 
in C. elegans (Libina et al., 2003). Because 
AMPK is expressed ubiquitously, and 
many of the factors involved in dietary re- 
striction-mediated longevity, including 
CRTC-1, are found in peripheral tissues,
AMPK and the CRTC-1/CREB complex 
were previously presumed to directly 
affect metabolism in tissues in which they 
are expressed (Mair et al., 2011). However,
the authors found that intestinal CRTC-1 
had no effect on longevity, while neuronal 
expression of the constitutively nuclear
CRTC-1, which is refractory to
AMPK regulation, was sufficient to suppress
the longevity effects and metabolic
transcription of AMPK activation, and 
even caused fragmentation of the mito- 
chondrial network in muscle cells. Simi- 
larly, neuronal rescue of NHR-49 in an 
nhr-49 null mutant induced metabolic 
changes in neurons, muscle, and intestine. 
Therefore, the effects of AMPK on peripheral
tissues seemed to be modulated by a
neuron-derived signal. Indeed, the authors 
next identified the neuromodulator octopamine
as the AMPK/CRTC-1-mediated
signal that alters metabolism in peripheral
tissues. AMPK/CRTC-1 signaling regulated
the expression of octopamine synthesis
enzymes, and loss of octopamine
abolished the reduced longevity of
CRTC-1 animals. Exogenous
octopamine treatment even phenocopied
the mitochondrial fragmentation seen in
muscle tissue upon neuronal CRTC-1 activation.
Thus, octopamine, acting as the
AMPK neuronal signal, was able to “over- 
ride” local AMPK signaling in peripheral 
tissues. 

The exact sites of action for some of 
these players still remain to be identified. 
Octopamine synthesis enzymes are ex- 
pressed in the RIC interneurons, a site 
of CRTC-1 localization, but CRTC-1 and 
NHR-49 may also act in additional 
neurons. The specific receptors and 
receiving cells of the octopamine signal 
are also unknown, but given that starva- 
tion induces CREB activity in SIA neu- 
rons to regulate acetylcholine release, it 
will be interesting to examine whether 
SIA neurons and/or acetylcholine activity 
are also involved in the CRTC-1 longevity 
response. Additionally, the direct tran- 
scriptional targets of neuronal NHR-49 
and CREB in this context are not known; 
AMPK’s regulation of growth and repro- 
duction does not involve CRTC-1, and 
CREB’s role in growth is largely due to
non-neuronal gene expression (Lakhina 
et al., 2015). Downstream changes in pe- 
ripheral tissues may be regulated by the 
activity of the longevity transcription
factors DAF-16 or PQM-1 (Tepper et al.,
2013), as the DAF-16 Associated 
Element (DAE) was overrepresented in 
the promoters of AMPK/CRTC-1’s
downstream transcriptional targets. The 
involvement of these transcription fac- 
tors also suggests that an insulin may 
act as an intermediate signal upstream 
of the peripheral tissues. While these 
are challenging questions, leveraging 
the distinct neuronal and peripheral tissue
transcriptional outputs will help untangle
them.

Figure 1. Nutrient Status Sensing in Neurons by AMPK May Relay a Signal between Neurons
to Peripheral Tissues

(Top) Upon reduced AMPK activity in neurons, CRTC-1 induces octopamine secretion, which alters
metabolic gene expression and causes mitochondrial fragmentation in peripheral tissues, potentially due
to the transcriptional activity of DAF-16/PQM-1 within these tissues. Octopamine is most likely sensed by
intermediate neurons, which may signal to peripheral cells via secreted cues such as insulins. (Bottom)
AMPK/CRTC1 in the human brain may communicate via (nor)epinephrine to the intestine and peripheral
tissues to regulate pro-longevity factors, such as FOXO.



Neuronal regulation of peripheral tissue
responses has been observed in varied
contexts; for example, dietary restriction
activates the transcription factor SKN-1
in ASI neurons, which signals peripheral
tissues to increase metabolic activity and
whole-body respiration (Bishop and Guarente,
2007), and heat stress activates the
AFD thermosensory neurons to elicit serotonin
release, which turns on HSF-1-mediated
transcription in distant germline tissues
(Tatum et al., 2015). Sensory cues
can regulate longevity of the whole organism,
as loss of ciliated sensory neurons,
odorant receptors, and the TRPV1 receptor
extend lifespan in worms and mice (Apfeld
and Kenyon, 1999; Riera et al., 2014).
CRTC1 activity in mammalian neurons
also affects organismal metabolism (Riera
et al., 2014), and upregulation of AMPK in
Drosophila neurons increases autophagy
in the brain as well as intestine (Ulgherait
et al., 2014), underscoring the conservation
of the signaling logic. The present findings
extend this theme of regulation of
whole organism and peripheral tissue sta- 
tus by neuronal signaling, but in this case, 
the activity of neuronal AMPK appears to 
have the ability to ignore its own signaling 
elsewhere. At least in worms, it seems 
that perception of nutrient status is more 
important than the actual status in periph- 
eral tissues themselves. While it is not clear 
how often these might become uncoupled, 
this remarkable finding suggests that cur- 
rent therapies aimed primarily at regulating 
AMPK signaling in peripheral tissues may 
be altered by neuronal signaling; or, seen 
in a more promising light, that sensing of 
AMPK status may be sufficient to induce 
beneficial metabolic effects. Therefore, 
future therapeutic investigations should 
include the consideration of effects on 
brain AMPK and CRTC1 signaling, in addi- 
tion to more direct effects in peripheral tis- 
sues themselves. 

REFERENCES 

Apfeld, J., and Kenyon, C. (1999). Nature 402, 
804-809. 

Bishop, N.A., and Guarente, L. (2007). Nature 447, 
545-549. 

Burkewitz, K., Morantte, I., Weir, H.J.M., Yeo, R., 
Zhang, Y., Huynh, F.K., Ilkayeva, O.R., Hirschey,
M.D., Grant, A.R., and Mair, W.B. (2015). Cell 
160, this issue, 842-855. 

Lakhina, V., Arey, R.N., Kaletsky, R., Kauffman, A., 
Stein, G., Keyes, W., Xu, D., and Murphy, C.T. 
(2015). Neuron 85, 330-345. 

Libina, N., Berman, J.R., and Kenyon, C. (2003). 
Cell 115, 489-502. 






Mair, W., Morantte, I., Rodrigues, A.P., Manning, 
G., Montminy, M., Shaw, R.J., and Dillin, A. 
(2011). Nature 470, 404-408.

Riera, C.E., Huising, M.O., Follett, P., Leblanc, M.,
Halloran, J., Van Andel, R., de Magalhaes Filho,
C.D., Merkwirth, C., and Dillin, A. (2014). Cell 157,
1023-1036.

Tatum, M.C., Ooi, F.K., Chikka, M.R., Chauve, L., Martinez-Velazquez,
L.A., Steinbusch, H.W., Morimoto,
R.I., and Prahlad, V. (2015). Curr. Biol. 25, 163-174.



Tepper, R.G., Ashraf, J., Kaletsky, R., Kleemann,
G., Murphy, C.T., and Bussemaker, H.J. (2013). 
Cell 154, 676-690. 

Ulgherait, M., Rana, A., Rera, M., Graniel, J., and
Walker, D.W. (2014). Cell Rep. 8, 1767-1780. 



Finding the Right Match Fast 

Divya Nandakumar1 and Smita S. Patel1,*

1Department of Biochemistry and Molecular Biology, Rutgers-Robert Wood Johnson Medical School, 675 Hoes Lane West, Piscataway,
NJ 08854, USA

*Correspondence: patelss@rutgers.edu
http://dx.doi.org/10.1016/j.cell.2015.02.007



DNA recombinases face the daunting task of locating and pairing up specific sequences among 
millions of base pairs in a genome, all within about an hour. Qi et al. show that recombinases solve 
this problem by searching in 8-nt microhomology units, reducing the search space and accelerating 
the homology search. 



Homologous recombination is important 
for repairing stalled replication forks 
and ensuring genetic diversity (Lusetti 
and Cox, 2002). The recombinase that 
mediates homologous recombination 
self-assembles into presynaptic helical 
filaments on single-stranded (ss) DNA to 
search for a sequence match in double- 
stranded (ds) DNA, and then the ssDNA 
displaces the non-complementary strand 
in dsDNA to form a stable synaptic com- 
plex. To ensure genome stability, this 
process must be fast and accurate, but 
how this occurs given the size and 
complexity of genomes has been a 
mystery (Renkawitz et al., 2014). In this 
issue, Qi et al. (2015) now show that a 
minimal homology length requirement 
reduces the search space and acceler- 
ates the search for target homologous 
sequences through a hierarchical search 
mechanism. 

Recognition of homologous sequence 
by the recombinase filament occurs via 
Watson-Crick pairing, and studies of 
E. coli RecA show that 15-18 bases of 
homology are sufficient for stable synap- 
tic complex formation (Hsieh et al., 
1992), long enough to represent a unique 
site in either the E. coli or human 
genome (~12 nt for E. coli and ~17 nt 
for humans). Given that the entire search 



process occurs within an hour, how does 
the recombinase filament find this unique 
site? Theoretical studies suggest that 
dividing the search process into multiple 
stages and employing smaller groups of 
bases are effective strategies for fast 
and accurate searches (Jiang and Pren- 
tiss, 2014). Consistent with this, multiple 
kinetic intermediates and transient com- 
plexes between the RecA filament and 6- 
to 7-nt homology segments in dsDNA 
have been detected (Ragunathan et al., 
2012). However, the shortest unit of 
homology that can form a stable synap- 
tic complex with the dsDNA remained 
unclear. 

To examine interactions of dsDNA 
sequences with the presynaptic fila- 
ment, Qi et al. monitored complexes of 
dsDNA with the Saccharomyces cere- 
visiae Rad51 filament using single-mo- 
lecule microscopy. In this method, a 
curtain of Rad51 filaments with ATP is 
generated on repeats of M13 ssDNA 
stretched across a flow chamber and 
anchored at both ends. The Rad51 fila- 
ments are then incubated with fluoro- 
phore-labeled dsDNA. After washing 
away unbound dsDNA, the bound 
dsDNAs are visualized and the off rates 
are measured. This method has the 
advantage of simultaneously monitoring 



multiple fluorescent dsDNA complexes 
on several presynaptic filaments in the 
curtain. 

When such experiments were carried 
out with non-homologous 70-bp dsDNA, 
the authors were surprised to find stable 
complexes with lifetimes as long as 
~16 min. Analysis of the dsDNA sequence 
revealed that each strand contained short 
tracts of microhomology (3-9 nt in length) 
with the M13 ssDNA, consistent with a 
previous study suggesting that 8-nt ho- 
mology is sufficient for initial base pairing 
(Hsieh et al., 1992). In contrast, dsDNAs 
with less than 8 nt microhomology formed 
unstable complexes, with average half- 
lives of ~0.5 s. 

Strikingly, a ~1,300-fold increase in the 
lifetime of complexes was observed when 
the base homology was increased by just 
one nucleotide, from 7 to 8 nt. This degree 
of stabilization was not observed when 
the microhomology was further increased 
to 9 nt or more. Binding energy increases 
by 8 kBT when the microhomology is 
increased from 7 to 8 nt, but only by 
0.4 kBT when going from 8 to 9 nt. In- 
terestingly, subsequent binding energy 
increases of ~0.4 kBT occurred in 3-nt 
increments, consistent with the triplet 
base organization of ssDNA observed in 
the crystal structure of RecA filament 
Figure 1. Mechanisms to Accelerate the Homology Search Process 

There is experimental evidence for at least three mechanisms that the recombinase presynaptic filament (green) uses to accelerate the search for the homologous 
sequence in the genome. (1) A short region of the filament binds to dsDNA with no or <8-nt homology and slides along the dsDNA looking for longer tracts of 
homology (Ragunathan et al., 2012). (2) The presynaptic filament probes multiple segments of the dsDNA that are separated in sequence through intersegmental 
contact sampling (Forget and Kowalczykowski, 2012). (3) The filament scans and rejects regions with <8-nt homology that occur at high frequencies in the 
genome. The filament forms relatively stable complexes with 8-nt and longer microhomology dsDNA regions. This decreases the search space and accelerates 
the homology search process (Qi et al., 2015). 



(Chen et al., 2008) and the propagation 
of initial synaptic filaments in 3-nt 
increments (Ragunathan et al., 2011). 
These findings indicate a length-depen- 
dent kinetic discrimination against se- 
quences that are less likely to be fully 
homologous. 
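
The link between the binding-energy steps and the lifetime changes quoted above can be checked with a short calculation. The sketch below assumes a simple Boltzmann-type picture in which the lifetime ratio of two complexes equals exp(ΔE / kBT); this model and the function names are illustrative assumptions, not the analysis used by Qi et al.

```python
import math

def lifetime_fold_change(energy_gain_kbt: float) -> float:
    """Fold change in complex lifetime for an energy gain given in units of kBT.

    Assumes lifetime scales as exp(energy / kBT), a simplifying assumption.
    """
    return math.exp(energy_gain_kbt)

def energy_gain_kbt(fold_change: float) -> float:
    """Energy gain (in kBT) implied by a given fold change in lifetime."""
    return math.log(fold_change)

# Energy gain implied by the ~1,300-fold lifetime increase from 7 to 8 nt
print(round(energy_gain_kbt(1300), 2))      # 7.17

# Lifetime change implied by a 0.4 kBT step (8 to 9 nt)
print(round(lifetime_fold_change(0.4), 2))  # 1.49
```

Under this assumption, a ~1,300-fold change corresponds to roughly 7 kBT, the same order as the ~8 kBT step reported, whereas each 0.4 kBT step lengthens the lifetime by only about 1.5-fold.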

Though the genome sizes of different 
organisms vary dramatically, initial recog- 
nition of 8-nt microhomology regions 
is conserved across the Rad51/RecA 
family. What is the advantage of search- 
ing for homology in 8-nt bits? The likeli- 
hood of a given sequence appearing in a 
genome decreases exponentially as the 
length of the sequence increases; for 
example, in the 12 Mbp genome of 
Saccharomyces cerevisiae, a specific 
7-nt sequence is expected to appear 
~2,947 times, while this figure drops to 



~762 times for an 8-nt sequence 
(Figure 1). A 10-nt sequence appears 
only ~46 times in the S. cerevisiae 
genome, but the 10 bp synaptic complex 
is stable for nearly half an hour. Thus, 
there must be a balance between the 
number of sites needed to scan the 
genome and the stability of the synaptic 
complexes. 
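
The exponential drop in expected matches with sequence length can be illustrated with a naive estimate. The sketch below assumes a single strand of uniformly random bases, so a specific k-nt sequence is expected (G - k + 1) / 4^k times in a genome of G bp. The absolute values from this model differ from the counts quoted above, which presumably rest on other assumptions; the roughly fourfold drop per added nucleotide is the point of the example. The function name and genome size constant are illustrative.

```python
def expected_occurrences(genome_length_bp: int, k: int) -> float:
    """Expected count of one specific k-nt sequence on one strand.

    Assumes independent, equally frequent bases (a simplification).
    """
    positions = genome_length_bp - k + 1  # number of possible start sites
    return positions / 4 ** k

GENOME_BP = 12_000_000  # approximate S. cerevisiae genome size from the text

for k in (7, 8, 10):
    print(k, round(expected_occurrences(GENOME_BP, k), 1))

# Each added nucleotide reduces the expected count about fourfold
ratio = expected_occurrences(GENOME_BP, 7) / expected_occurrences(GENOME_BP, 8)
print(round(ratio, 2))
```

The fourfold ratio from this model is close to the ratio of the 7-nt and 8-nt figures quoted in the text (2,947 / 762, about 3.9).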

Given that binding was relatively stable 
beyond the microhomology “sweet spot” 
of 8 nt, there must be mechanisms to 
dissociate stable complexes that are 
sampled but incorrect. Indeed, Qi et al. 
also show that competitor dsDNA can 
increase the dissociation rates of 8- and 
12-nt complexes by as much as 3-fold, 
suggesting a “facilitated exchange” 
mechanism. The authors also predict 
that, while longer filaments have more 



sites to interrogate in the genome, in- 
creasing the search time, they also per- 
mit multiple, simultaneous interactions, 
which may decrease the search time. 
In vivo, other factors such as helicases, 
nuclear organization proteins, and chro- 
mosome mobility may aid both the search 
and dissociation of incorrect complexes. 
Future studies with added factors will be 
necessary to refine the homology search 
models. 

Additional mechanisms may work 
together with kinetic discrimination to 
accelerate the homology search (Fig- 
ure 1). For instance, multiple segments 
of long dsDNA are probed simulta- 
neously by “intersegmental contact 
sampling” (Forget and Kowalczykowski, 
2012). Curiously, this single-molecule 
approach did not observe stable 
synaptic complex formation with 
extended dsDNA despite the presence 
of numerous microhomology sites. 
Short-range sliding of presynaptic fila- 
ments on dsDNA substrates has also 
been observed, which may speed up 
the search process by ~200 fold (Ragu- 
nathan et al., 2012). 

The advent of elegant single-molecule 
methods has allowed us to better under- 
stand the molecular mechanisms of 
homologous DNA recombination (San- 
chez et al., 2014), but several questions 
remain. The crystal structure of ssDNA- 
bound RecA filament shows that the 
ssDNA has periodic base triplets in nearly 
B form, followed by an extended bond 
(Chen et al., 2008), so the structural basis 
for 8-nt microhomology recognition is not 



clear. Similarly, how the dynamics of the 
presynaptic filament are involved in 
recognizing dsDNA base pairing is not 
known. Finally, while the proposed mech- 
anisms for accelerating the homology 
search may work in concert, they have 
not been observed simultaneously in a 
single study. Further work is needed to 
determine how, or if, the mechanisms 
function together, as well as the relative 
contribution of each to accelerating the 
process. 

Nuclear receptors bind chromosome ends in “alternative lengthening of telomeres” (ALT) cancer 
cells that maintain their ends by homologous recombination instead of telomerase. Marzec et al. 
now demonstrate that, in ALT cells, nuclear receptors not only trigger distal chromatin associations 
to mediate telomere-telomere recombination events, but also drive chromosome-internal targeted 
telomere insertions (TTI). 



Telomeres, the ends of chromosomes, 
would look just like the products of DNA 
double strand breaks if not for their 
specialized sequences and cohort of 
protective binding proteins. The cellular 
overproliferation characteristic of cancer 
requires some means of maintaining 
telomeric sequence through successive 
rounds of replication. For some cells, 
that involves reactivating telomerase, the 
enzyme that templates the characteristic 
telomere repeats. For others, it means 
relying on a homologous recombination- 
dependent mechanism termed alternative 
lengthening of telomeres (ALT). In this 



issue of Cell, Marzec et al. (2015) identify 
nuclear receptors as critical components 
in reprogramming normal telomeres to- 
ward ALT. 

In most normal human somatic cells, 
telomeres shorten with every round of 
DNA replication due to the DNA end repli- 
cation problem and the absence of telo- 
merase. Telomeres that are too short elicit a DNA 
damage response triggering a permanent 
cell-cycle arrest termed cellular senes- 
cence. Thus, the replicative potential of 
primary cells is limited, restraining the 
growth of pre-cancerous lesions that 
have lost normal growth control. How- 



ever, mutations in cell-cycle regulators 
like p53 and pRB cause senescence 
bypass and restart the march toward 
malignancy. Replication under these 
conditions can lead to further telomere 
shortening and loss of the proteins that 
protect chromosome ends from fusion or 
“repair.” In cases in which telomeres do 
fuse, cells enter a crisis state in which 
fused chromosomes that contain multiple 
centromeres become missegregated or 
become torn apart during mitosis. Cells 
can escape crisis either by re-gaining te- 
lomerase expression, for instance by 
mutating the promoter of the human 
Figure 1. Molecular Events that May Trigger the Alternative Lengthening of Telomeres Pathway 

At normal telomeres, HR is repressed by the high abundance of shelterin proteins. For ALT formation, nuclear receptors accumulate at telomeres, binding to 
variant telomeric repeat sequences. Nuclear receptor binding promotes formation of chromatin clusters that favor HR and spreading of telomeric variant repeats, 
which in turn will promote further nuclear receptor binding. Nuclear receptors may also promote recruitment of chromatin remodelers, which favor HR protein 
association and counteract the presence of shelterin. While ATRX loss will also promote telomeric chromatin remodeling, it upregulates the long noncoding RNA 
TERRA. Misregulated TERRA forms recombination-prone R loops that activate HR. TERRA also perturbs telomeric protein composition favoring association of 
RPA with telomeric DNA, which activates the ATR checkpoint kinase. Whether ATRX loss and nuclear receptor binding to telomeres are sufficient to trigger ALT 
remains to be tested. 



telomerase reverse transcriptase (hTERT) 
gene, or by engaging the ALT pathway to 
maintain telomeres. 

ALT is found in ~10% of cancers and 
is prevalent in sarcomas and glioblas- 
tomas. ALT telomeres are maintained 
by the homologous recombination (HR) 
machinery. Recombination occurs be- 
tween telomeres of separate chromo- 
somes. HR and telomere clustering 
are repressed at normal telomeres by 
the telomeric shelterin proteins (Sfeir 
and de Lange, 2012). Critical changes 
occur in ALT telomeric chromatin to over- 
come the shelterin-mediated repression 
of HR. 

Orphan nuclear receptors of the NR2C/F 
class that classically regulate gene ex- 
pression have also been found at ALT 
telomeres (Dejardin and Kingston, 2009). 
Subsequent work demonstrated that the 
nuclear receptors bind to variant telomeric 
repeat sequences (5'-GGGTCA-3' instead 
of the canonical 5'-GGGTTA-3' repeats) 
that are scarce in normal telomeres but 
that accumulate at ALT telomeres (Conomos 
et al., 2012). In the new study, Marzec 
et al. identified tandem 5'-GGGTCA-3' 
repeats accumulating at ALT-telomeres 
bound by NR2C/F nuclear receptors. 
Importantly, nuclear receptor binding to 
telomeres induced telomere cluster for- 
mation, which is required for HR in ALT 
cells. Intriguingly, tethering an NR2C2-LacI 
fusion protein to a single LacO array not 
only led to colocalization of the array with 
telomeres, but also triggered its rapid 
amplification and spreading to other sites 
in the genome. It thus appears that the 
telomere clusters are sites of highly active 
recombination, and NR2C/F-mediated 
recruitment of DNA to this locus is suffi- 
cient to make this DNA recombine and 
spread elsewhere in the genome. 

Consistent with this notion, Marzec 
et al. also discovered that a subset of 
chromosomal NR2C/F binding sites in 
ALT cells are locations of targeted telo- 
mere insertions (TTI). These newly in- 
serted interstitial telomeric sequences 
may promote genome instability in ALT 
cells, as telomeric DNA is fragile and 



difficult to replicate. The association of 
chromosome-internal NR2C/F bind- 
ing sites with telomeres may explain 
the spreading of the NR2C/F-bound 
5'-GGGTCA-3' repeats into telomeric 
tracts. Alternatively, the 5'-GGGTCA-3' 
sequences may amplify from the rare 
telomeric copies. Thus, the new work 
suggests a sequence of molecular 
events that may occur during the evo- 
lution of ALT cells from normal cells. 
Critically short telomeres, missing key 
shelterin proteins, may expose the 
scarce 5'-GGGTCA-3' sequences for 
NR2C/F binding. Binding of NR2C/F 
at telomeres and its binding at chro- 
mosome internal sites would then 
promote chromatin clustering, HR, and 
5'-GGGTCA-3' spreading, which in turn 
would facilitate further NR2C/F binding, 
telomere cluster formation, and recom- 
bination in a feedforward loop reaction 
(Figure 1). Thus, in this hypothetical 
scenario, HR would reinforce itself in 
ALT once it was triggered by rare initi- 
ating events. 

Is nuclear receptor binding to telomeres 
sufficient to trigger ALT? Probably not. 
Spreading of the receptor binding sites 
at telomeres using a mutant version of 
telomerase did not trigger ALT, although 
it was sufficient to induce some ALT-spe- 
cific features such as accumulation of 
single-stranded telomeric (CCCTAA)n 
DNA circles (C-circles) (Conomos et al., 
2012). Recent work has pinpointed addi- 
tional distinct events required for ALT. 
These include recruitment or mutation of 
distinct chromatin remodeling factors 
that contribute to displacement of shel- 
terin, HR factor binding, and HR activa- 
tion. Among these, binding of the histone 
deacetylase NuRD is sustained by nuclear 
receptors (Conomos et al., 2014). Muta- 
tions in other factors may support ALT 
through mismanagement of histone as- 
sembly. For example, depletion of the 
histone chaperone ASF1 induced ALT 
features, although this protein is not 
generally mutated in ALT-utilizing cancers 
(O’Sullivan et al., 2014). However, a 
strong correlation was found between 
ALT status and mutations in the SWI/ 
SNF family ATP-dependent helicase 
ATRX (Lovejoy et al., 2012), and ATRX 
loss was recently intimately linked to the 
onset of recombination at ALT telomeres 
(Flynn et al., 2015). ATRX loss leads to 
upregulation of the telomeric long non- 
coding RNA TERRA, which may perturb 
protein association with single-stranded 
telomeric DNA, coinciding with accumu- 
lation of replication protein A (RPA) at telo- 
meres. RPA activates the DNA damage 
protein kinase ATR, which seems impor- 
tant for ALT as ATR inhibition led to selec- 
tive killing of ALT cells (Flynn et al., 2015). 
At the same time, loss of ATRX and 
TERRA upregulation in S phase may pro- 
mote the formation of telomeric R loops. 
In R loops, an RNA strand is base paired 
with the template DNA strand of a DNA 
duplex, leaving the displaced non-tem- 
plate DNA single stranded. Telomeric R 
loops are repressed at normal telomeres 
(Pfeiffer et al., 2013), but they become 
prevalent in ALT cells, where they pro- 
mote recombination between telomeric 
repeats (Arora et al., 2014). Overall, nu- 
clear receptor accumulation at telo- 
meres and ATRX loss seem to represent 
two essential triggering events for ALT 
(Figure 1). 

In summary, the new work by Marzec 
et al. elucidates critical roles for nuclear 
receptors in mediating telomeric chro- 
matin associations in ALT that are essen- 
tial for recombination. The work provides 
a model for how nuclear receptor binding 
sites spread at telomeres and uncovers 
TTI as a novel mechanism of genome 
instability in ALT cells. In combination 
with other recent results, these findings 
support the hypothesis that ALT activa- 
tion depends on several molecular 
events. Finally, this complexity may 
explain why telomerase reactivation 
instead of ALT is the more frequently 
selected route toward immortality of can- 
cer cells. 

The quest to slow aging has come far, and what used to be the domain of science fiction writers and 
snake oil salesmen may soon become science fact. Innovative new approaches, such as the use of 
the very short-lived African killifish (Harel et al.), are bridging the translational gap and bringing the 
promise of healthy longevity to fruition. 



Over the past couple of decades, a 
tremendous amount has been discovered 
about the biology of aging from studies 
largely conducted in classic model organ- 
isms: budding yeast, nematode worms, 
fruit flies, and mice. However, by relying 
almost exclusively on a single vertebrate 
model and by focusing on animals main- 
tained under laboratory conditions, we 
are left with a relatively poor understand- 
ing of the extent to which these models 
translate to other animals and environ- 
ments and eventually to humans. This 
has led to a growing recognition in the 
field that innovative new tools and ap- 
proaches are needed. In the current issue 
of Cell, Harel et al. (2015) describe just 
such a new tool: the short-lived African 
killifish Nothobranchius furzeri. We are 
now entering a period where 
new animal models and ap- 
proaches will allow us to delve 
deeply into the conserved 
mechanisms of aging and 
bridge the final translational 
gap from laboratory models 
to people. 

In many ways, the African 
killifish combines some of 
the best features of the major 
model systems in one animal. 

With a 4-6 month lifespan 
that is still comparable to 
the well-characterized inver- 
tebrate aging models, the 
African killifish is the short- 
est-lived vertebrate that can 
be cultivated in the laboratory 
easily and relatively inexpen- 
sively. This naturally com- 
pressed lifespan allows for 
a unique opportunity to study 
mortality, physiology, and 



age-related disease in a vertebrate model 
with blood, bones, and an adaptive im- 
mune system. Nothobranchius exhibits 
many age-dependent phenotypes and 
pathologies, including decreased fertility, 
sarcopenia, cognitive decline, and cancer 
(Di Cicco et al., 2011; Hartmann et al., 
2009; Terzibasi et al., 2009), and even 
has telomeres that resemble those of humans 
both in length and in progressive decline. 
Researchers also benefit from the exis- 
tence of both inbred and wild-caught 
strains in this species, providing a useful 
tool for genetic mapping and comparative 
genomics studies, as well as a short life 
cycle, large brood size, and ease of drug 
administration. All of these attributes 
make this model an ideal candidate for 
high-throughput in vivo drug screens. 



Until now, research on this burgeoning 
model has been limited by the lack of a 
sequenced genome and genetic tools to 
manipulate gene function and expression. 
Harel et al. (2015) utilized de novo genome 
assembly to produce a fully sequenced 
and annotated genome and CRISPR/ 
Cas9 technology to generate mutant al- 
leles for 13 different genes associated 
with aging. As proof of principle, they 
focus on the protein subunit of telomerase 
(TERT) as a model of telomere attrition. By 
targeting the catalytic domain of TERT, 
the authors were able to create a TERT 
loss-of-function mutant that results in 
shortened telomeres, reduced fertility, 
and deficiencies in other proliferative tis- 
sues such as blood and intestine. In doing 
this, the authors achieve in 2 months what 
takes several generations in 
mice (Rudolph et al., 1999) 
and 6-8 months in zebrafish 
(Anchelin et al., 2013), gener- 
ating the fastest vertebrate 
model of telomere shortening 
and firmly establishing the 
killifish as a promising and 
tractable platform to investi- 
gate vertebrate aging. 

As Figure 1 shows, the 
African killifish fills an impor- 
tant evolutionary gap in aging 
models between mammal 
models, which diverged from 
humans 40-90 million years 
ago, and the invertebrate 
models that diverged 900 
million years ago or more. 
Comparative biological ap- 
proaches that incorporate ex- 
ceptionally long-lived models 
such as the naked mole rat 
and some species of clams 
Figure 1. Evolutionary Relationships between Humans and Organisms Frequently Used in Aging Research 

Approximate average lifespan shown for each organism. The African killifish 
(Harel et al., 2015) has an average lifespan comparable to invertebrate models 
while being much more closely related to people. Ancestors of humans and 
killifish diverged approximately 400 million years ago (red line). Thus, Notho- 
branchius fills a gap (gray box) between the mammalian models that are 
separated from humans by ~100 million years and invertebrate models, which 
diverged ~900 million years ago. The asterisk (*) denotes the shared envi- 
ronment between humans and dogs. 
will also undoubtedly play an important 
role in further contributing to progress in 
this area. At the same time, two other spe- 
cies are now emerging that we believe 
may prove crucial to bridging the transla- 
tional gap to humans. The first of these is 
the common marmoset, Callithrix jacchus. 
In addition to being a short-lived anthro- 
poid primate, with an average lifespan of 
about a decade, marmosets suffer from 
many of the same age-related diseases 
and declines in function across a variety 
of tissue and organ systems as people 
do (Tardif et al., 2011). Marmosets are 
also becoming more commonly studied 
in the lab, with a growing set of genomic 
and transgenic tools available. Research 
on aging in marmosets is continuing at 
several primate research centers and 
should teach us much about the roles of 
genetic and environmental variation in 
natural aging processes in primates. 

The other animal we believe has the po- 
tential to be a “game changer” in aging 
research is the domestic dog, Canis lupus 
familiaris. Although dogs are as evolution- 
arily distant from humans as mice are, 
they share our environment in a way that 
can’t be replicated in the lab. Moreover, 
the extensive level of pre-existing veteri- 
nary knowledge and quality of health 
care in dogs, including geriatric ones, is 
second only to that of humans. Recent 
studies have described the dramatic dif- 
ferences among breeds in life expec- 
tancy, patterns of age-specific mortality, 
and causes of death (Fleming et al., 
2011; Waters, 2011). Aside from the 
important insights that dogs can provide 
into human aging, promoting healthier 
longevity in companion animals has 
intrinsic value for the millions of people 
who own them. These features, combined 
with their genetic and phenotypic diver- 
sity, well-characterized breed structure, 
and (unfortunately!) short lifespans, 
make companion dogs a compelling 
choice for both longitudinal and interven- 
tional studies of aging. 

Traditional laboratory models have led 
to large advances in our understanding 
of the basic mechanisms of aging. 
Despite this progress, the extent to which 
these mechanisms affect human aging 
and age-related disease has yet to be 
determined. Further, the complex effects 
of environmental and genetic variation 
make for a large leap from genetically 
identical organisms in tightly controlled 
laboratory conditions to genetically 
diverse humans exposed to many 
different environments. Studies in non- 
traditional animal models of aging may 
hold the key to filling this gap. The addi- 
tion of the African killifish as a new aging 
model, as well as the extension of studies 
to primates that are more closely related 
to humans and to canines that share our 
environment and model phenotypic diver- 
sity, holds promise in the ongoing quest to 
combat aging and age-related diseases. 

While modernization has dramatically increased lifespan, it has also witnessed the increasing prev- 
alence of diseases such as obesity, hypertension, and type 2 diabetes. Such chronic, acquired dis- 
eases result when normal physiologic control goes awry and may thus be viewed as failures of 
homeostasis. However, while nearly every process in human physiology relies on homeostatic 
mechanisms for stability, only some have demonstrated vulnerability to dysregulation. Additionally, 
chronic inflammation is a common accomplice of the diseases of homeostasis, yet the basis for this 
connection is not fully understood. Here we review the design of homeostatic systems and discuss 
universal features of control circuits that operate at the cellular, tissue, and organismal levels. We 
suggest a framework for classification of homeostatic signals that is based on different classes of 
homeostatic variables they report on. Finally, we discuss how adaptability of homeostatic systems 
with adjustable set points creates vulnerability to dysregulation and disease. This framework high- 
lights the fundamental parallels between homeostatic and inflammatory control mechanisms and 
provides a new perspective on the physiological origin of inflammation. 



Introduction 

Changes in human ecology— including diet, physical activity, 
population density, and microbial exposure— have dramatically 
shifted the spectrum of human diseases over the past century. 
Genes selected to protect from starvation, infections, injury, 
and predation may now, in the absence of some of these chal- 
lenges, contribute to the increasing incidence of “modern human 
diseases,” including obesity, type 2 diabetes, atherosclerosis, 
autoimmunity, allergy, and certain psychiatric disorders. Plau- 
sible evolutionary explanations for the high prevalence of these 
diseases in industrialized countries include antagonistic pleiot- 
ropy (Williams, 1957) and the mismatch between modern envi- 
ronment and human evolutionary history (Gluckman et al., 
2009; Stearns and Koella, 2008). 

These modern human diseases seem to have two features in 
common: they involve disruption of homeostasis, and they are 
nearly universally associated with chronic inflammation. Despite 
this well-documented connection between inflammation and 
diseases of homeostasis, the underlying evolutionary and mech- 
anistic bases remain obscure. In most complex diseases, in 
contrast to rare Mendelian diseases, the pathological state has 
a normal, physiological counterpart. The etiology of modern hu- 
man diseases may therefore point to the physiological rationale 
connecting inflammation and homeostasis. 

Most physiological processes can only operate under a 
narrow range of conditions, which are maintained by special- 
ized homeostatic mechanisms in the face of variations in 
the environment, and adjusted in response to changes in 



functional demands and biological priorities. Interestingly, 
only some of these processes are vulnerable to dysregula- 
tion and disease. For example, lipid and glucose metabolism 
can be derailed, leading to dyslipidemia, diabetes, and 
obesity, while amino acid metabolism seems resistant to ho- 
meostatic dysregulation. Here we present a view that may 
help explain the differential susceptibility of physiological 
processes to diseases of homeostasis. We explore the funda- 
mental connections between homeostasis and inflammation 
and discuss an evolutionary perspective on homeostatic 
diseases. 

Homeostatic Variables and Control Circuits 

In the 19th century, Claude Bernard articulated the need to main- 
tain a stable internal environment (milieu intérieur) that would 
allow biological processes to proceed despite variations in the 
external environment (Bernard, 1878). Bernard’s concept was 
further explored, developed, and popularized by Walter Cannon, 
who coined the term “homeostasis” in describing how key phys- 
iological variables are maintained within a predefined range by 
feedback mechanisms (Cannon, 1929). His contemporary, Curt 
Richter, expanded the notion of homeostasis to include behav- 
ioral responses as an important mechanism by which homeosta- 
sis could be regulated in addition to the internal control systems 
described by Bernard and Cannon (Moran and Schulkin, 2000; 
Richter, 1943). Nearly two decades after Cannon, James Hardy 
proposed a model in which homeostatic mechanisms maintain 
physiological variables within an acceptable range by comparing 
Table 1. Glossary 

Term: Stock 
Definition: A system’s variable that represents quantity 
Examples: Blood glucose concentration 

Term: Flow 
Definition: A system’s variable that represents a process that changes the stock 
Examples: Gluconeogenesis, glycogenolysis, glycolysis, gluconeogenesis, glucose transport 

Term: Regulated variable 
Definition: A physiologic variable that is maintained at a stable level (near set point) by homeostatic circuit(s). Regulated variables are stocks 
Examples: Blood glucose concentration 

Term: Controlled variable 
Definition: A physiologic variable that is manipulated in order to maintain the regulated variable within desired range. Controlled variables are flows 
Examples: Gluconeogenesis, glycogenolysis, glycolysis, gluconeogenesis, glucose transport 

Term: Set point 
Definition: An optimal value of the regulated variable; divergence from set point value activates homeostatic control mechanisms 
Examples: Normoglycemia (~5 mM) 

Term: Error value |X-X’| 
Definition: The difference between the set point and the actual value of the regulated variable 
Examples: Difference between actual blood glucose concentration and normoglycemia 

Term: Controller 
Definition: A component of the homeostatic circuit that monitors the value of regulated variable 
Examples: Pancreatic α and β cells 

Term: Plant 
Definition: An effector component of the homeostatic circuit that is activated by the Controller to change the value of regulated variable 
Examples: Skeletal muscle, white adipose tissue, brown adipose tissue, liver 

Term: Controller gain 
Definition: A characteristic of Controllers that determines the amount of signal produced in response to given error value X-X’ 
Examples: Amount of insulin produced by β-cells in response to a given blood glucose level 

Term: Gain tuning of Controller 
Definition: A method to optimize Controller performance 
Examples: Changing the amount of insulin produced in response to a given blood glucose level 


the actual value of the variable to a desired value or “set point” 
(Hardy, 1953-1954). 

Homeostasis is a unifying theme of modern physiology and 
much has been elucidated about molecular mechanisms of ho- 
meostatic control. However, the term, being intuitively simple, 
is often used loosely. For the purpose of this discussion, it is 
important to introduce and review some key definitions and con- 
cepts initially developed in control theory and systems dynamics 
theory, but applicable to homeostatic control in biological sys- 
tems (see Table 1 for glossary). 

First, it is important to distinguish two types of variables that 
exist in homeostatic systems. The physiological variables that 
are maintained at a stable level, such as blood glucose or core 
body temperature, are called regulated variables. In contrast, 
controlled variables are activities, or rates, of the processes 
that contribute to the stability of regulated variables (Cabanac, 
2006). For example, blood calcium concentration is a regulated 
variable, whereas the rate of urinary calcium excretion is a 
controlled variable that is manipulated in order to regulate blood 
calcium concentration. Multiple controlled variables typically 
contribute to the stability of a given regulated variable. Thus, in 
addition to calcium excretion in the kidney, the rates of intestinal 
calcium absorption and bone resorption are also controlled vari- 
ables that contribute to the maintenance of stable blood calcium 
concentration. In the case of blood glucose concentration 
(a regulated variable), the controlled variables include the rates 
of intestinal and renal glucose transport, glycogenolysis,
gluconeogenesis, glycolysis, glycogenesis, and glucose transport
from the blood into tissues. Thus, regulated variables refer to 
quantities, whereas controlled variables refer to processes, 
where process activity or rate is a variable. Put in systems
dynamics terms, regulated variables are the stocks of the system,
while controlled variables are the flows of the system: they either
increase (in-flows) or decrease (out-flows) the value of the
regulated variable (Figure 1). Notably, while all regulated variables
are stocks, not all stocks are regulated variables. For example, blood
glucose is a regulated variable, whereas blood alcohol is not.
Likewise, all controlled variables are flows, but not all flows are
controlled variables. Thus, heat loss through sweating is a
controlled variable, while heat loss through conduction is not.
Because these terminologies capture different aspects of system
behavior, we will use both during this discussion to emphasize
the relevant features of homeostasis.

In order to be maintained within the desired range, the values 
of regulated variables must be continuously monitored and 
adjusted. Accordingly, all homeostatic systems have two essen- 
tial components: Controllers and Plants. The Controllers monitor 
the value of the regulated variable (X), compare it to the reference 
value (or, in Hardy’s terms, set point) (X’), and generate a
signal that is proportional to the absolute value of the difference
|X - X’| (the coefficient of proportionality is a characteristic known
as the Controller’s gain) (Åström and Murray, 2008). This signal
then acts on the Plant— the effector that creates flows into or 
out of the system — in order to bring the regulated variable closer 
to the reference value (Figure 2A). In a classic engineering 
example of a control system, the thermostat (Controller) com- 
pares the actual room temperature (regulated variable) to the 
desired room temperature (reference value, or set point). If actual 
room temperature is lower than the set point, a signal is 
generated and sent to the furnace (the Plant) to increase heat 
production (the flow) and raise room temperature toward the 
set point value. In physiology, the Controllers are typically
endocrine cells and sensory neurons of the autonomic nervous system,
lower brainstem (medulla), and hypothalamus (Hammel, 1968). They
monitor deviations in regulated physiologic variables from their set
points and generate signals (hormones and neurotransmitters) that
increase or decrease the flows created by various Plants (tissues and
organs that can adjust these values) (Figure 2B). For example,
pancreatic β-cells (Controller) produce insulin in response to an
increase in blood glucose (regulated variable). Insulin acts on
skeletal muscle, adipose tissue, and liver (the Plants) to increase
glucose uptake and utilization (out-flows) in muscle and fat and to
inhibit gluconeogenesis (in-flow) in the liver, thereby reducing
plasma glucose level (Figure 2C).

Figure 1. Stock and Flow Model of Homeostasis

(A) Stock and flow model highlights two types of variables in homeostasis:
Stock is the quantity of a regulated variable, a parameter that is maintained
by homeostasis. Flows are the processes that change the value of the stock.
Some, but not all, flows are controlled variables and targets for homeostatic
control signals (graphically represented here as dials). Clouds represent
“sources” and “sinks” for the regulated variable that are extrinsic to the
homeostatic system.

(B) A physiologic example of stock and flow model: dietary glucose uptake,
hepatic glucose production, or glucose uptake into adipose and muscle are
flows that maintain the stock of blood glucose.

Controllers and Plants are defined with respect to specific
regulated variables. For example, pancreatic α- and β-cells are
Controllers for blood glucose, but not for body temperature, 
whereas adipose tissue and liver are Plants for blood glucose, 
but not for blood calcium (where the relevant Plants are the kid- 
ney, intestine, and bone). Additionally, most tissues and organs 
perform many functions and can therefore act as Plants for mul- 
tiple regulated variables, depending on the requirements of the 
organism: because skeletal muscle can both consume glucose 
and generate heat during shivering thermogenesis, it can act 
as a Plant for both blood glucose and body temperature. Thus, 
Controllers are characterized by the regulated variables they 
monitor, while Plants are characterized by the controlled vari- 
ables (activities of the flows) associated with them. 
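
As a minimal sketch of the Controller and Plant scheme described above, the following Python snippet simulates a single stock under proportional control. The function names, the constant disturbance (standing in for an uncontrolled in-flow), and all numerical values are illustrative assumptions and are not taken from the source. The signal is computed as gain × |X - X’|, as in the text, with a separate direction flag deciding whether the Plant adds to or removes from the stock.

```python
def controller(x, set_point, gain):
    """Compare the stock x to the set point and return (signal, direction).

    The signal magnitude is gain * |x - set_point|; direction is -1 when the
    stock must be lowered and +1 when it must be raised.
    """
    error = x - set_point
    signal = gain * abs(error)
    direction = -1.0 if error > 0 else 1.0
    return signal, direction


def plant(signal, direction):
    """Turn the control signal into a net flow into (+) or out of (-) the stock."""
    return direction * signal


def simulate(set_point=5.0, gain=0.5, disturbance=1.0, x0=5.0, dt=0.01, steps=5000):
    """Integrate dX/dt = disturbance + controlled flow with explicit Euler steps."""
    x = x0
    for _ in range(steps):
        signal, direction = controller(x, set_point, gain)
        x += dt * (disturbance + plant(signal, direction))
    return x


# Hypothetical comparison of two Controller gains under the same disturbance.
low_gain_level = simulate(gain=0.5)
high_gain_level = simulate(gain=2.0)
print(low_gain_level, high_gain_level)
```

Under these assumptions the stock settles at X’ + disturbance/gain, that is, near 7.0 with a gain of 0.5 and near 5.5 with a gain of 2.0. In this toy model a higher Controller gain holds the regulated variable closer to its set point, but a purely proportional signal never removes the error entirely while the disturbance persists.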



Figure 2. Homeostatic Control Circuit 

(A) Basic homeostatic control circuits have two essential components:
Controllers and Plants. Controllers monitor the value of the regulated
variable (X) and compare it to the reference value (X’). In response to
deviation of X from X’,
Controllers generate a signal (S) that acts on Plants. Plants are the effectors of 
the homeostatic systems that change the value of the regulated variable. 

(B) A physiologic example of control circuit: pancreatic beta cells act as 
Controller, sensing elevated blood glucose and producing insulin (signal S) to 
increase glucose uptake into skeletal muscle (Plant). In the simplest model, the 
output of the Controller (signal S) is proportional to the deviation of regulated 
variable from the reference value, |X-X’|. The proportionality constant is 
referred to as the gain. 

(C) Combining stock and flow modeling with the basic control circuit provides a 
more complete model of homeostasis. The Controller monitors the value of the 
Stock and produces signals that act on Plants. Such signals cause Plants to 
modulate the flows that contribute to the Stock. In this example, glucose 
sensing by the pancreas (Controller), induces glucagon or insulin secretion 
(Signals S’ and S”), which act on liver and muscle (Plants), to control glucose 
production and uptake, respectively (flows) and stabilize blood glucose 
(Stock). 

Fifty years after its inception, there is still disagreement over
Hardy’s concept of set point, which in his model was analogous
to the reference value of engineered systems. Some argue that
regulated variables can reach steady state or “settling point”
without an external reference point (Wirtshafter and Davis,
1977). In stock and flow terms, the stock would not be regulated
by comparison to a set point, but simply reach a passive “settling
point” when in-flows and out-flows balance. In other words, one
can think of set point as being either a predefined or an emergent
characteristic of a system. A full discussion of the strengths
and limitations of these two models is beyond the scope of this
article. However, the two models may not necessarily be mutually
exclusive (Speakman et al., 2011). Regardless of whether a
reference point is real or imaginary, the term set point, if nothing
else, is a convenient shortcut by which to refer to the defended
level of a regulated variable and will be used herein for simplicity.
For the sake of this discussion, it should not be thought of
as equivalent to the external reference value in engineered
systems.

Figure 3. Homeostatic Units

(A) System stock, Plant stock, and Storage stock each represent homeostatic
units that are connected by flows. Each of the stocks is monitored by a
specialized Controller, which regulates the flows into and out of the stock.
The homeostatic system is thus hierarchically organized into “nested”
homeostatic units.

(B) Physiologic example of nested homeostatic units: System stock (blood
glucose) is monitored by System Controller (pancreatic β-cells). Plant stock
(glucose in skeletal muscle) is monitored by Plant specific Controller (e.g.,
AMPK) and Storage stock (muscle glycogen) is presumably monitored by a
glycogen sensor, which is currently unknown. Each of the Controllers
regulates the flows into and out of the corresponding stock.
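
The distinction between a settling point and a set point discussed above can be made concrete with a deliberately simple worked example. The symbols F_in (a constant in-flow), k (a first-order out-flow rate constant), and g (a Controller gain) are assumptions introduced here for illustration only; the source does not specify any functional form for the flows.

```latex
% Passive settling point: constant in-flow, first-order out-flow, no reference value
\[
\frac{dX}{dt} = F_{\mathrm{in}} - kX
\quad\Longrightarrow\quad
X^{*} = \frac{F_{\mathrm{in}}}{k}
\]

% Set point control: an additional controlled flow proportional to the error X - X'
\[
\frac{dX}{dt} = F_{\mathrm{in}} - kX - g\,(X - X')
\quad\Longrightarrow\quad
X^{*} = \frac{F_{\mathrm{in}} + g\,X'}{k + g}
\]
```

In the first case the steady state emerges entirely from the balance of flows and shifts whenever F_in or k changes. In the second, the steady state approaches X’ as g grows and reduces to the passive settling point when g = 0, which is one way to see why the two views need not be mutually exclusive.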

Homeostatic Units 

Homeostasis has been studied primarily with regard to systemically
regulated variables such as plasma glucose level and core
body temperature. However, many of the same variables are
also homeostatically maintained at the level of individual cells
within tissues. Such variables are referred to as System stocks
when they are maintained at the systemic level and Plant stocks
when they are maintained at the level of individual Plants.
Thus, while blood glucose (System stock) is maintained by insulin,
glucagon, and catecholamines, glucose level in skeletal muscle
(Plant stock) is simultaneously monitored by intracellular
sensors and homeostatically maintained through regulated
expression of glucose transporters and activity of metabolic
pathways of glucose utilization (Herman and Kahn, 2006; Jensen
et al., 2008). On the organismal level, pancreatic β-cells function
as Controllers and skeletal muscle as Plants. Within individual
myocytes, AMPK functions as a Controller (monitoring intracellular
glucose level) and GLUT4 (a glucose transporter) functions
as a Plant. The signal connecting Controllers to Plants in this
case is the signaling pathway connecting AMPK to GLUT4
expression. Note that System stock and Plant stock are connected
by a flow (e.g., glucose transport from blood into skeletal
muscle by GLUT4) (Figure 3). GLUT4 expression and glucose
flow can be controlled by both the system level Controller (in
this case, by insulin) and by the tissue level Controller (in this
case, by AMPK). In exercising muscle, for example, glucose
and ATP depletion leads to AMPK activation, prompting
insulin-independent glucose uptake (a tissue-level control) even
when insulin-stimulated uptake might be suppressed (a
system-level control) (Herman and Kahn, 2006; Russell et al.,
1999). Conversely, when skeletal muscle energy stores are
high, insulin-dependent glucose uptake is inhibited, as illustrated
by insulin resistance that can be caused by fatty acid accumulation
in the muscle (Samuel and Shulman, 2012) or by activity of
the hexosamine biosynthetic pathway (Ruan et al., 2013).

Some Plant stocks have a special property: glycogen in the
liver and muscle, triglycerides in the adipose tissue, and calcium
phosphate in the bone are examples of Storage stocks. They
buffer regulated variables (blood glucose, fatty acids, and calcium)
from the variations in dietary intake or expenditure. The
System stocks (e.g., blood glucose), Plant stocks (muscle
glucose), and Storage stocks (muscle glycogen) are connected
by in- and out-flows (glucose transport, glycogenolysis, and
glycogenesis), which are adjusted by hormones and neurotransmitters
to maintain the System stock within a desired range
(Figure 3). The relationship between regulated stocks and storage
stocks is analogous to the relationship between pocket
money and money in a bank account: they are connected by
flows (deposits and withdrawals) and while the former is usually
maintained within a relatively narrow range, the latter is not.
Storage stocks exist for some regulated variables (glucose, fatty
acids, vitamin A, calcium), but not for others (oxygen, sodium,
potassium). Accordingly, the latter variables are more vulnerable
to fluctuations in environmental availability.

As noted earlier, Plants are defined by the regulated variables
they maintain. The notion of the Plant is only relevant with
respect to a specific homeostatic circuit. When skeletal muscle
is referred to as a Plant in glucose homeostasis, it is specifically
its activities in glucose handling that are relevant. In that sense
the terms “Plant” and “Tissue” are not equivalent. All tissues
have their own homeostatic circuits that may or may not be
related to their function as Plants or Controllers. Like any
homeostatic system, tissues have their own regulated and controlled
variables. Oxygen and nutrient concentration, interstitial fluid
volume, pH, osmolarity, cell number, and cellular composition
are all examples of regulated variables of tissue homeostasis
(Chovatiya and Medzhitov, 2014). Cell proliferation, apoptosis
and migration, lymphatic drainage, and vascular permeability
are examples of controlled variables. Typical Controllers include
tissue resident macrophages, mast cells, and somatosensory
neurons, all of which monitor various regulated variables of tissue
homeostasis. Finally, many cells within tissues (including
vascular and lymphatic endothelium, stromal, and parenchymal
cells) can act as Plants, depending on the controlled variable
(Chovatiya and Medzhitov, 2014).

As noted earlier, some regulated variables, for example,
glucose, are homeostatically maintained as System stock (blood
glucose), Plant stock (muscle glucose), and Storage stock (muscle
glycogen). All three stocks are connected by flows. However,
not all regulated variables are connected in this manner: for
example, protein concentration in a cell and in plasma are both
regulated variables, but they are not connected by flows;
collagen stiffness/elasticity is a regulated variable of tissue
homeostasis, but it does not even have a counterpart at cellular
or organismal levels. When a regulated variable is maintained
by homeostatic circuits at multiple levels that are connected by
flows, the result is interdependent, “nested” homeostatic units
(Figure 3). This hierarchical organization of homeostasis provides
buffering and flexibility in addressing systemic and
tissue-specific physiologic needs and priorities.

Controllers as Sensors of Regulated Variables 

Controllers play a key role in homeostasis by monitoring the
values of the regulated variables. There are two methods used
by Controllers to perform this function. Some Controllers monitor
the values of regulated variables through a flow that samples the
System stock. As an example, β-cells monitor blood glucose
level by transporting glucose through the GLUT2 transporter and
converting it by glucokinase into glucose-6-phosphate (G6P) to
initiate glycolysis (Olson and Pessin, 1996). ATP generated by
glycolysis then inhibits the ATP-sensitive potassium channel,
resulting in plasma membrane depolarization, calcium influx, and
insulin secretion (Newgard et al., 2002; Newgard and McGarry,
1995). The flow of glucose into β-cells has special features that
enable glucose sensing. First, GLUT2 has a very high Km for
glucose (15-20 mM) and only transports glucose when its level
in the blood is high (Burant and Bell, 1992). Similarly, glucokinase
has a low affinity for glucose compared to other hexokinases
(Matschinsky, 1996). These properties make the β-cell sensitive
to high plasma glucose level. Second, the flows into Controllers
are not subject to inhibition by negative feedback, unlike the
flows into Plants. Thus, glucokinase, unlike hexokinases, is not
inhibited by G6P (Matschinsky, 1996); otherwise the amount of
ATP generated by glycolysis would not be proportional to the
amount of glucose transported into the β-cells.

An alternative means by which to monitor the system stock is
through dedicated receptors. For example, sensory neurons
typically use various gated channels and other sensors to
monitor temperature (e.g., TRPM8 and TRPV1), pH (ASICs),
oxygen (pO2 sensor in glomus cells of carotid body), and stretch
(stretch sensors in baroreceptors) (Krishtal, 2003; Montell, 2005;
Prabhakar, 2000). Many metabolites, for example, fatty acids and
ketones, can be monitored both directly by GPCRs (Briscoe
et al., 2003; Oh et al., 2010) and through their flow into
Controllers, where they are metabolized.

Physiological Priorities 

As Cannon aptly noted when selecting the prefix homeo, or 
similar, rather than homo, same (Cannon, 1929), homeostatic 
variables are not maintained at a constant level, but rather within 
a certain range of values. Some physiological variables (e.g., 
plasma glucose) are tolerated over a relatively wide dynamic 
range, while others must remain within a narrow range (e.g., 
plasma calcium). Moreover, the same regulated variable can 
have a different acceptable dynamic range in different tissues: 
for example, the brain has low tolerance to deviations in many
physiologic variables (including oxygen, glucose, and temperature)
while white adipose tissue is typically less demanding.
Thus, the most sensitive tissues both define the limits of homeostatic
range for the corresponding regulated variables and tend
to be better protected from the fluctuations in these variables.
For example, the brain is relatively insulated from the normal
variation of blood glucose levels (ranging between 4 mM and 7 mM)
due to the neuronal expression of the high-affinity glucose
transporter GLUT3, which has a low Km for glucose (~1 mM) (Burant
and Bell, 1992).
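
The effect of transporter affinity described above can be illustrated with a short calculation. This is only a sketch: it assumes simple Michaelis-Menten kinetics for glucose transport (an assumption not stated in the source), takes the GLUT3 Km of about 1 mM and the 4 to 7 mM glucose range from the text, and uses 17.5 mM for GLUT2 as the midpoint of the 15-20 mM range quoted earlier. The function and variable names are hypothetical.

```python
def fractional_rate(glucose_mM, km_mM):
    """Michaelis-Menten transport rate as a fraction of Vmax: S / (Km + S)."""
    return glucose_mM / (km_mM + glucose_mM)


# Km values: GLUT3 from the text (~1 mM); GLUT2 taken as the midpoint of 15-20 mM.
TRANSPORTER_KM = {"GLUT3": 1.0, "GLUT2": 17.5}

# Normal range of blood glucose quoted in the text (mM).
LOW_GLUCOSE, HIGH_GLUCOSE = 4.0, 7.0

relative_change = {}
for name, km in TRANSPORTER_KM.items():
    low = fractional_rate(LOW_GLUCOSE, km)
    high = fractional_rate(HIGH_GLUCOSE, km)
    # Relative change in transport rate across the normal glucose range.
    relative_change[name] = (high - low) / low
    print(f"{name}: {low:.3f} -> {high:.3f} of Vmax "
          f"({100 * relative_change[name]:.1f}% change)")
```

Under these assumptions the low-Km transporter is already close to saturation at 4 mM, so its rate changes by less than 10% across the normal range, whereas the rate of the high-Km transporter changes by more than 50%. The same kinetic property that buffers neurons from glucose fluctuations therefore makes a GLUT2-expressing cell a responsive sensor of them.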

Homeostatic prioritization is also reflected in the contribution
of the different Plants to the maintenance of the regulated
variable. As alluded to earlier, a given regulated variable can be
affected by multiple Plants. For example, blood glucose level
can be affected by muscle, liver, adipose, kidney, and intestine 
through uptake, metabolism, and excretion. The relative contri- 
butions of different Plants to blood glucose level need to be co- 
ordinated to minimize fluctuation of the stock. Thus, increased 
glucose consumption by exercising skeletal muscle can be 
compensated for by decreased consumption by the adipose tis- 
sue and/or by increased gluconeogenesis by the liver. While all 
three Plants can affect the value of the regulated variable (in 
this case glucose), their relative contributions can change de- 
pending on their functional states and physiological priorities of 
the organism. The corollary to this feature is that increased 
flow burden is dynamically allocated between different Plants, 
which in turn necessitates communication between Plants to co- 
ordinate their contributions to systemic homeostasis, as we 
discuss next. 

Homeostatic Control Signals 

The classical view of homeostasis is that it is maintained by sig- 
nals from the endocrine and autonomic nervous systems. 
Recent discoveries have extended this paradigm by demon- 
strating that signals produced by tissues and organs not histor- 
ically thought of as endocrine organs— including adipose tissue, 
the intestine, the liver, the muscle, and the kidneys— also play 
critical roles in homeostatic control. Examples of these signals 
include the adipokines leptin (Friedman and Halaas, 1998), adi- 
ponectin (Yamauchi et al., 2001), and RBP4 (Yang et al., 2005); 
the hepatokine FGF21 (Fisher et al., 2011); the myokines IL-6 
(Pedersen and Febbraio, 2012) and meteorin-like (Rao et al., 
2014); and the gut hormones FGF15/19 (Potthoff et al., 2011), 
CCK (Gibbs et al., 1973), and GLP-1 (Holst, 2007). While the 
mechanisms of action of many of these signals are still being 
elucidated, one could argue that not all signals are equivalent 
in the type of information they communicate within a homeo- 
static circuit. 

As discussed above, there are two types of variables in
homeostasis: stocks and flows. The stocks can be further divided
into System stocks (e.g., plasma glucose), Plant stocks (e.g.,
muscle glucose), and Storage stocks (e.g., muscle glycogen).
We propose that each type of stock and flow is monitored and
translated into a distinct class of homeostatic signals that reports
on their value (Figure 4), giving rise to four classes of homeostatic
signals:

(1) Signals of the first class are produced by System Controllers
and report on the value of the System stocks (Signal Sa in
Figure 4). These are classical endocrine hormones and efferents
of the autonomic nervous system that operate in negative
feedback loops. Examples include insulin and glucagon reporting
on plasma glucose level, or parathyroid hormone reporting on
plasma calcium level.

(2) Signals of the second class report the value of the Plant
stocks (Signal Sb in Figure 4). Plant stocks are monitored
by cell or tissue specific Controllers, such as AMPK,
mTOR, HIF-1α, stretch receptors, and many others. These
sensors generate negative feedback signals that control
the flows into Plant stocks in a cell or tissue autonomous
manner (such as the example of insulin-independent
glucose uptake in exercising muscle, described above).
Additionally, Plants produce signals that control the flows
in a systemic manner. Signals of this category include
various myokines, such as IL-6 and meteorin-like (Pedersen
and Febbraio, 2012; Rao et al., 2014), which appear to
report on fuel depletion in muscle.

(3) Signals of the third class report the value of Storage
stocks (Signal Sc in Figure 4). For example, leptin reports
on the available fat storage in adipose tissue, and therefore
controls food intake (caloric inflow) and energy
expenditure (caloric outflow) (Friedman and Halaas,
1998). Hepcidin, similarly, reports on the storage stock
of iron in the reticuloendothelial system in order to inhibit
dietary iron uptake and prevent iron overload (Nemeth et al.,
2004). Signals reporting on available glycogen stores are not
known but are likely to exist. Signals of this class also
participate in negative feedback circuits.

(4) Signals of the fourth class report the values of flows
(Signal Sd in Figure 4). For example, the gut hormone GLP-1
reports on dietary glucose inflow, and therefore anticipates
rising systemic glucose stock (which is itself reported by
insulin) (Holst, 2007). CCK and NAPEs
(N-acylphosphatidylethanolamines) similarly report on dietary
fat inflow and reduce appetite to suppress further inflow
(Gibbs et al., 1973; Gillum et al., 2008). FGF21 is produced by
hepatocytes during fasting (Badman et al., 2007; Inagaki et al.,
2007) and potentially reports on flow of fatty acids from the
adipocytes during lipolysis. FGF21 expression in the liver is
induced by fatty acids through PPARα (Potthoff et al.,
2012). One might speculate that while PPARγ sensing of
fatty acids in adipose tissue is an indicator of the inflow
into the fat storage stock (taking place during
feeding-associated lipogenesis), PPARα sensing of fatty acids
in the liver is an indicator of the outflow from the
storage stock (taking place during fasting-induced
lipolysis). One important feature of signals that report on
flows is that they typically operate in a feed-forward
fashion. Because a change in a flow is predictive of the
subsequent change in the stock, the signal reporting
on an increased inflow, for example, would be expected
to increase the outflow and inhibit other inflows of the
same stock. This is in contrast to signals that report on
the System, Plant, and Storage stocks, which all operate
in a feedback fashion to maintain the stock within an
acceptable range.

Figure 4. Four Classes of Signals Control Systemic Homeostasis

(A) Four classes of homeostatic signals report on values of four
different types of variables: System stock (regulated variable),
Plant stock, Storage stock, and Flows. Each stock and the flows are
monitored by dedicated Controllers and sensors. All four categories
of homeostatic signals modulate gain tuning of Controllers and flow
tuning in Plants. Signals that report on stocks operate in feed-back
loops. Signals that report on flows operate in feed-forward loops.

(B) Signals reporting on the System stock (Sa) are classical
endocrine hormones and efferents of the autonomic nervous system
(e.g., insulin and glucagon). Signals reporting on Plant stocks (Sb)
primarily operate in a cell or tissue autonomous manner (e.g., AMPK
controlling GLUT4 expression), but may include signals acting
systemically (e.g., AMPK controlling IL-6 expression in exercising
muscle). Signals reporting on Storage stocks (Sc) indicate available
resources (e.g., leptin reporting on fat stores). Finally, signals
reporting on Flows (Sd) indicate anticipated changes in the System
stock (e.g., GLP-1 reporting on incoming glucose). The examples are
chosen to illustrate the point.

Monitoring the flows enables the system to minimize time de- 
lays that are unavoidable in negative feedback systems. React- 
ing to changing flows elicits an anticipatory response that 
makes the homeostatic system more robust to environmental 
fluctuations and helps to prevent dramatic changes in the 
stock. For example, intestinal glucose in-flow reporting by 
GLP-1 helps to prevent dramatic postprandial glucose spikes 
that would be unavoidable if only stock (blood glucose) report- 
ing by insulin were available. Not every flow in the system needs 
to be monitored and reported as a signal. Presumably, only the 
flows that have a major impact on the system’s stock are moni- 
tored, particularly the flows that operate at the interface with the 
environment (for example, in the intestine, liver, kidney, lungs, 
and skin). 
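
The benefit of reporting on flows can be sketched with a toy simulation. The model below is an illustration under stated assumptions, not a physiological model: the control signal lags behind its target with a hypothetical time constant tau, the meal is a rectangular pulse of in-flow, the system is kept linear (a negative signal simply stands for a counter-regulatory in-flow), and all parameter values and names are invented for the example. The feed-forward term adds a fraction of the measured in-flow directly to the signal's target, loosely analogous to GLP-1 reporting on intestinal glucose in-flow.

```python
def peak_excursion(feedforward_gain, feedback_gain=0.1, tau=1.0,
                   set_point=5.0, dt=0.01, t_end=300.0):
    """Return the largest rise of the stock above its set point after a meal."""
    x = set_point   # stock, e.g. blood glucose
    s = 0.0         # control signal, which sets the controlled out-flow
    peak = x
    for i in range(int(t_end / dt)):
        t = i * dt
        # Rectangular meal pulse: in-flow from the environment.
        meal = 1.0 if 10.0 <= t < 40.0 else 0.0
        # Feedback reports on the stock; feed-forward reports on the in-flow.
        target = feedback_gain * (x - set_point) + feedforward_gain * meal
        x += dt * (meal - s)           # stock: in-flow minus controlled out-flow
        s += dt * (target - s) / tau   # signal approaches its target with a lag
        peak = max(peak, x)
    return peak - set_point


feedback_only = peak_excursion(feedforward_gain=0.0)
with_feedforward = peak_excursion(feedforward_gain=0.8)
print(feedback_only, with_feedforward)
```

With these hypothetical parameters the excursion of the stock is smaller when the in-flow is reported in addition to the stock, because the out-flow starts to rise before the stock itself has deviated far from its set point.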

The four categories of signals outlined above are defined by 
the homeostatic variables they report on. The effects of homeo- 
static signals fall into three categories: First, homeostatic signals 
directly regulate the flows of the system: for example, insulin 
suppresses hepatic gluconeogenesis. Second, homeostatic 
signals can change the sensitivity of the flows to another homeo- 
static signal: for example, placental hormones and glucocorti- 
coids reduce the sensitivity of target tissues to insulin. Third, 
homeostatic signals can change the gains of the Controllers.
For example, GLP-1 increases and leptin decreases the gain of
the pancreatic β-cells: they change the amount of insulin
produced in response to a given level of blood glucose. Thus, in
addition to adjusting the flows of Plants, homeostatic signals 
can change the gains of Controllers. 

In summary, a complex array of signals reporting on available 
stocks and flows allows Controllers to coordinate multiple Plants 
toward regulation of a homeostatic variable, while simulta- 
neously balancing the needs and capabilities of individual Plants. 
Thus, application of the “stock and flow” model provides a 
framework for functional classification of homeostatic signals 
and extends the traditional model of homeostasis, which is 
focused exclusively on Controller-to-Plant signals. 

Adjustable Set Points and Homeostatic Adaptation 

Homeostatic circuits can be broadly divided into two classes— 
those that have a single fixed set point and those with multiple 
or adjustable set points. The fixed set point circuits are
characteristic of regulated variables that have a narrow dynamic range,
such as arterial pO2 or blood calcium concentration. Homeostatic
systems with fixed set points are regulated solely by
changing the flows, such as calcium resorption, excretion, storage,
and utilization. The adaptability of systems with a single set
point is limited by the homeostatic range of the regulated vari- 
able; when the regulated variable deviates beyond the accept- 
able range (for example in extreme environments when the 
buffering capacity of the system is overwhelmed), the system 
can undergo catastrophic pathological changes. The failure of 
one homeostatic circuit may lead to a disruption of other con- 
nected circuits, resulting in particularly dangerous scenarios of 
cascading failures, as seen, for example, in sepsis. 



In some cases, the changing environment or physiologic de- 
mands cannot be accommodated by homeostatic circuits with 
a fixed set point. In these cases, adjustable set points can be em- 
ployed to maintain regulated variables within different dynamic 
ranges and enable more efficient adaptation to varying de- 
mands. This ability to maintain conditions “at changing rather 
than similar levels or values” has been referred to as rheostasis 
(Mrosovsky, 1990). 

There are several examples of homeostasis with variable set 
points. Among the most obvious is fever, where the set point 
for core body temperature rises and is maintained at a higher 
level (as opposed to hyperthermia, where homeostatic mecha- 
nisms are engaged to return the temperature to the default set 
point). An extreme example of set point change is seen during hi- 
bernation: normally, ground squirrels exhibit an average daily 
body temperature near 37°C. During hibernation, however, their 
temperature may fall below 0°C and metabolic rate is dramati- 
cally suppressed (Barnes, 1989). This extreme physiologic 
switch is thought to permit adaptation to conditions of food scar- 
city that would be incompatible with life if the squirrels main- 
tained their normal metabolic and temperature set points. 
Similarly, in human pregnancy, many physiologic parameters 
such as blood pressure, blood glucose, total body water, and 
adiposity are dramatically altered in order to meet the needs of 
the fetus (King, 2000). These set point adjustments can occur 
even in a stable environment and reflect the adaptation to chang- 
ing physiological priorities. Thus, a variety of environmental fac- 
tors and changing physiological priorities, including seasonal 
and circadian changes, reproductive status (puberty and preg- 
nancy), stress, nutrition, and infection, require homeostatic 
adaptations which in some cases appear to involve set point 
adjustments. 

The change of the set points can occur in two different ways, 
depending on whether the set point-adjusting stimulus has to be 
continuously present to maintain a new set point value. The 
change of the body temperature set point during fever is induced 
by prostaglandin PGE2, which acts on thermoregulatory hypo- 
thalamic neurons (Romanovsky et al., 2005). As soon as inflam- 
mation subsides (or PGE2 production is blocked by COX2 
inhibitors), the temperature set point changes back to the original 
value of 37°C. Thus, in this case, the continuous presence of 
PGE2 is required to maintain the altered set point for body tem- 
perature. The implication of this is that although all set points are 
defended, not all set points are equally stable: 37°C is the default 
set point for human body temperature, whereas set points 
induced by fever are not. As soon as the inducing stimulus sub- 
sides or is blocked, the system switches back from the induced 
set point to the default set point. This design feature provides a 
failsafe to prevent permanent and pathological shifts in the set 
point by requiring persistent stimulation. In contrast, the set point 
for human body weight appears to be maintained at multiple 
alternative stable states. The homeostatic systems that have 
alternative stable states without a default set point are particu- 
larly vulnerable to dysregulation, as we discuss next. 
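
The difference between a default set point and a stimulus-dependent set point can be illustrated with a toy simulation. The sketch below is not taken from the text: the function names, the 2°C shift, and the first-order relaxation rate are illustrative assumptions. Only the 37°C default and the requirement for a continuously present stimulus (PGE2) come from the discussion above.

```python
DEFAULT_SET_POINT = 37.0  # default human body temperature set point (deg C), from the text
FEVER_SHIFT = 2.0         # assumed size of the PGE2-induced shift (illustrative only)


def set_point(pge2_present: bool) -> float:
    """The induced set point holds only while the stimulus is present."""
    return DEFAULT_SET_POINT + (FEVER_SHIFT if pge2_present else 0.0)


def simulate(pge2_schedule, temp=DEFAULT_SET_POINT, rate=0.5):
    """Relax body temperature toward the current set point at each step."""
    trace = []
    for pge2_present in pge2_schedule:
        temp += rate * (set_point(pge2_present) - temp)
        trace.append(temp)
    return trace


# PGE2 present for 10 steps (fever), then withdrawn for 10 steps.
trace = simulate([True] * 10 + [False] * 10)
print(round(trace[9], 2), round(trace[-1], 2))
```

In this sketch the temperature is defended at the elevated value only while the stimulus persists and returns to the default once it is withdrawn. A system with alternative stable states would instead need an explicit opposing stimulus to switch back.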

Set Points and Diseases of Homeostasis 

In contrast to circuits with fixed set points, which are generally 
robust to perturbations, homeostatic circuits with adjustable 
set points are vulnerable to dysregulation precisely because they
are designed to be adjustable. For example, the adjustable set 
point for body weight and adiposity allows for adaptation to 
times of food abundance or scarcity, as well as to the accumu- 
lation of fuel stores to feed a growing fetus. However, in the 
setting of the modern environment, adjustable set points may 
have contributed to the current obesity epidemic (Speakman 
et al., 2011; Woods and Ramsay, 2007). If body adiposity had 
a fixed set point value, obesity would be impossible except for 
purely genetic reasons. In fact, most tissues other than visceral
fat have a single set point value for their size control as a function
of body size, which is why they are not subject to homeostatic
dysregulation. Like adiposity, glucose and lipid homeostasis
are characterized by adjustable set points, while amino acid 
and purine/pyrimidine metabolism appear to have a single set 
point; accordingly, the former are vulnerable to homeostatic dys- 
regulation while the latter are not. 

One disease state particularly interesting from this perspective 
is insulin resistance. Insulin’s best-known function is to stimulate 
glucose uptake by skeletal muscle and adipose tissue, thereby 
reducing glycaemia. However, it is now appreciated that insulin 
has myriad effects, orchestrating a coordinated anabolic effort 
by liver, skeletal muscle, and white adipose tissue to convert 
glucose and fatty acids into glycogen and triglycerides, respec- 
tively, to export these when necessary for storage in the appro- 
priate organ, and to suppress the mobilization of stored fuels 
(Schenk et al., 2008; Shulman and Petersen, 2011). In addition, 
insulin induces a trophic response in many cell types that pro- 
motes protein synthesis, and consequently cellular and tissue 
growth (Shulman and Petersen, 2011). Interestingly, not all of 
these functions are reduced during the insulin resistant state 
(Brown and Goldstein, 2008), nor are all organs equally affected. 
Thus, insulin resistance is not equivalent to reducing the quantity 
of insulin in the blood, but rather is a method of physiologic set 
point adjustment that allows the organism to reallocate re- 
sources between different tissues. 

Insulin sensitivity can be changed in many altered physiologic 
states. During pregnancy, critical illness, infection, and stress, 
insulin responsiveness is diminished, presumably to allocate re- 
sources toward a growing fetus, tissue repair, or the immune 
system, respectively (Odegaard and Chawla, 2013; Power and 
Schulkin, 2012; Waive and Yajnik, 2007). Conversely, insulin 
sensitivity is heightened during caloric restriction and weight 
loss, perhaps to increase anabolic efficiency. 

Unfortunately, the adjustability of the insulin sensitivity set point 
also makes it vulnerable to disease. Insulin resistance is widely 
accepted as the pathological precursor for diabetes, a dangerous 
potential complication of obesity. Thus, the very mechanisms that 
evolved to make insulin receptor sensitivity adjustable also enable 
pathological insulin resistance. The same argument applies to 
other homeostatic systems with multiple set points that correspond
to alternative stable states: they are vulnerable to dysregulation
because they are designed to be adjustable.

As noted above, some homeostatic systems with multiple set 
points have a default set point value and any change of set point 
has to be actively maintained. Such systems, including control of 
body temperature, are generally less vulnerable to dysregulation 
because alternative set points are not stable. 



Inflammation and Homeostatic Circuits 

Inflammation is a protective response to extreme challenges to 
homeostasis, such as infection, tissue stress, and injury. Inflammatory
signals, including cytokines, chemokines, biogenic amines, and
eicosanoids, induce myriad changes in diverse biological
processes, ranging from local vascular responses to
alterations of body temperature. Despite this complexity and di- 
versity of functions, all the activities of inflammatory signals can 
be described in terms of their effects on homeostatic circuits: 
First, inflammatory signals can directly stimulate or inhibit the 
flows of various homeostatic systems. For example, TNF and 
IL-1β activate lipolysis, inhibit gluconeogenesis, and increase
vascular permeability to fluids and solutes, while IL-6 changes 
hepatic protein synthesis (Medzhitov, 2008). Second, in addition 
to directly affecting the flows, inflammatory signals can change 
the sensitivity of the Plants to homeostatic signals. For example, 
TNF makes liver, fat, and skeletal muscle less sensitive to insulin 
(Hotamisligil et al., 1993; Weisberg et al., 2003). Third, inflammatory
signals can change the gain of the Controllers. For example,
TNF and IL-1β suppress expression of GLUT2 and glucokinase
in pancreatic β-cells, thus making them less sensitive to the
blood glucose level (Park et al., 1999). Consequently, β-cells produce
less insulin given the same amount of plasma glucose, an
example of gain tuning of the Controller. As discussed above,
homeostatic signals also operate by directly regulating flows, 
by changing sensitivity of Plants to other homeostatic signals, 
and by gain-tuning of Controllers. Thus homeostatic and inflam- 
matory signals employ identical methods to change the same 
homeostatic variables (Figure 5). 

Importantly, the inflammatory mediators are both antagonistic 
to and dominant over homeostatic signals. They are antagonistic 
because normal homeostasis is often incompatible with the 
goals of the inflammatory response, and the former has to be 
temporarily disengaged. Inflammatory signals are dominant 
because they have higher physiological priority as they orches- 
trate the protective response to life threatening insults of infec- 
tion and injury. Thus, homeostatic control of body temperature 
(thermogenesis or sweating) is normally induced by changes in 
ambient temperature. However, acute inflammation overrides 
this control by raising the set point of body temperature, thereby 
inducing thermogenesis and fever regardless of ambient tem- 
perature. Likewise, acute inflammation-induced anorexia sup- 
presses caloric intake regardless of the adiposity, circulating 
nutrient concentrations, or body weight. 

It is increasingly appreciated that chronic inflammation is an 
important component of numerous disease states including 
obesity, type 2 diabetes, atherosclerosis, asthma, and neurode- 
generative diseases. One potential mechanism by which inflam- 
mation may initiate or perpetuate disease is through set point 
changes. In obesity, for example, macrophages and other cells 
of the immune system infiltrate adipose tissue in response to 
the increased burden of lipid accumulation and adipocyte stress 
(Hotamisligil and Erbay, 2008; Weisberg et al., 2003). These cells 
produce inflammatory cytokines that are capable of shifting ho- 
meostatic set points in states of chronic inflammation, just as 
they do in acute inflammatory states. The rationale for transiently 
adjusting the insulin responsiveness in acute inflammation is 
presumed to be in shifting nutrient allocation from tissues that 
Figure 5. Inflammatory Signals and Homeostasis

(A) Inflammatory signals (IS) act through the same control points (Plant flows
and Controller gains) as homeostatic signals (HS). To illustrate the parallels 
between homeostatic and inflammatory signals, the source of inflammatory 
signal is referred to as Inflammatory Controller (e.g., macrophage), by analogy 
to Homeostatic Controller (e.g., endocrine pancreas). 

(B) Macrophages produce TNF and IL-1 which act on the same flows as insulin, 
but in opposite direction: TNF and IL-1 induce insulin resistance and suppress 
lipid storage in adipose tissue by inhibiting lipoprotein lipase. In addition, these 
cytokines induce gain tuning of the pancreatic β-cells to reduce the amount of
insulin produced in response to a given level of blood glucose. This effect is
achieved in part by suppressing glucose flow into β-cells.

have lower priority during infection (adipose and skeletal muscle) 
toward the higher priority immune defenses (Hotamisligil and 
Erbay, 2008). In obesity, chronic inflammation may contribute 
to the shift of insulin sensitivity to an alternative set point. 

Inflammation is a protective response that is engaged to 
defend and restore physiological functions when homeostatic 
mechanisms are insufficient. The inflammatory response can 
only achieve this goal by overriding or suppressing incompatible 
homeostatic controls. However, in its attempts to restore homeostasis,
inflammation may enforce and propagate homeostatic
set point changes that are detrimental and can result in
chronic pathological states. This happens when a persistent
change in the set point itself creates a problem sufficient to pro- 
mote inflammation. For example, hyperglycemia can lead to 
glucose toxicity and tissue damage, which in turn can lead to 
secondary inflammation. Similarly, the abnormal accumulation 
of harmful lipid mediators (lipotoxicity) in adipocytes, liver, and 
muscle in obesity leads to cellular stress and tissue dysfunction, 
and consequently to inflammation (DeFronzo, 2010; Samuel and 
Shulman, 2012; Summers, 2006). Thus, a homeostatic perturba- 
tion initially induced by lipotoxicity may be further perpetuated 
by inflammation. In such scenarios, a vicious cycle can ensue 
that may explain the chronicity of some homeostatic diseases 
and their perpetuation by inflammation. Such a model is consis- 
tent with data demonstrating that inflammation is dispensable for 
the initial induction of insulin resistance, but contributes to main- 
taining and even worsening insulin resistance in states of chronic 
obesity (Oh et al., 2012). 

A successful inflammatory response is followed by the resolution
phase that restores homeostasis. However, because inflammation
is induced by loss of homeostasis, but also intentionally
disrupts incompatible homeostatic processes, the system has
the potential to become locked in a state of chronic inflammation
that fails to resolve. The non-resolving inflammation may, in
turn, account for the persistence of chronic diseases (Nathan 
and Ding, 2010; Serhan et al., 2007). It is therefore important to 
identify the mechanisms responsible for physiological shifts be- 
tween alternative stable states of the homeostatic systems, as 
the same mechanisms could be employed therapeutically to 
reverse pathological states in chronic diseases of homeostasis. 

Perspectives: Evolution, Adaptation, and Disease 

The concept of adaptability as vulnerability is pervasive in many 
forms of phenotypic variation, be they reversible (body weight) or 
irreversible (body height), continuous (reaction norms) or discon- 
tinuous (polyphenisms). Traits that are discontinuous are ex- 
pressed through one of several alternative developmental 
pathways, a phenomenon known as phenotypic plasticity (Dew- 
itt et al., 1998; Feinberg, 2007; Stearns and Koella, 2008). Such 
plasticity can allow for different phenotypes in the same organ- 
ism, and can therefore afford greater adaptability. The choice 
of a particular developmental pathway is dictated by anticipation 
of certain environments where these pathways and associated 
traits would provide greater adaptation. However, if the environment
is not as anticipated and the phenotypic choice is irreversible,
maladapted phenotypes susceptible to disease may result
(Dewitt et al., 1998; Feinberg, 2007; Stearns and Koella, 2008). 
Consequently, the mechanisms that afford greater adaptability 
can also create vulnerability to diseases (Bateson et al., 2004). 
Thus, phenotypic plasticity can be thought of as a develop- 
mental equivalent of homeostasis with alternative stable states 
dictated by adjustable set points. 

The homeostatic capacity of an organism determines its ability
to adapt to varying environments. Homeostatic systems with 
fixed set points are inflexible but resistant to dysregulation. If 
their buffering capacity is overwhelmed, the consequences are 
likely to be catastrophic, acute, and transient, but rarely yield
chronic disease. Comparatively, homeostatic systems with
adjustable set points provide a greater degree of adaptability
but are vulnerable to dysregulation and disease when the set
points of the system are changed inappropriately, as often happens
during chronic inflammation. Thus, the flexibility and
adjustment of physiological and developmental characteristics, 
while providing a benefit of more efficient adaptation, are also 
responsible for the diseases of homeostasis. Treatment and pre- 
vention of diseases of homeostasis therefore will require a better 
understanding of the mechanisms responsible for the switch be- 
tween developmental trajectories and homeostatic set points. 

Summary 

Here, we present a framework that highlights the fundamental 
connections between homeostasis and inflammation. This 
framework is based on concepts previously developed in control 
theory and system dynamics theory. The key points of the frame- 
work are summarized below: 

• Homeostasis maintains essential parameters of the system
within an acceptable range. These parameters are regulated
variables or stocks of the system. The processes
that change or maintain these parameters are known as
flows. The activity of a flow is a parameter known as the
controlled variable.

• Homeostatic systems have two components: Controllers 
and Plants. Controllers monitor the stocks while Plants op- 
erate the flows. 

• If the value of the regulated variable (X) differs from the set
point value (X’), Controllers produce signals (S) that act
on Plants to change the relevant flows.

• Controller output is proportional to the error value |X-X’|. 
The coefficient of proportionality is a characteristic known 
as Controller’s gain. 

• Controllers can have a combination of different gains: pro- 
portional gain corresponds to the present error value, inte- 
gral gain corresponds to the accumulated past error 
values, and differential gain corresponds to the anticipated 
future error value. The Controllers that have all three gains 
are known as PID (proportional, integral, differential) 
Controllers. 

• The gain of Controller can be tuned to change the setting 
of the system. In PID Controllers different gains can be 
tuned independently of each other to optimize system’s 
performance. 

• Homeostatic systems can have a single fixed set point, or 
multiple adjustable set points. The former are inflexible but 
robust to dysregulation. The latter are more adaptable but
vulnerable to dysregulation. Chronic homeostatic diseases
can result when the system becomes locked in an alterna- 
tive stable state. 

• Plants have their own stocks. A special case of Plant stock 
is Storage stock. Storage stocks buffer the System stock 
from external fluctuations. System stock, Plant stock, and
Storage stock are connected by flows. Stocks connected 
by flows form nested homeostatic units, where each stock 
is regulated coordinately with other connected stocks. 

• Homeostatic signals fall into four classes defined by the
four types of homeostatic variables they report on: System
stock, Plant stock, Storage stock, and the flows. Each of
these variables, and the signals that report on them, provide
different information about the homeostatic system:
o System stock: information about the present value of the
regulated variable and its deviation from the set point. Reported
by classical endocrine hormones and efferents
of the autonomic nervous system.
o Plant stock: information about the homeostatic capacity
of individual Plants to maintain the System stock.
Reported by non-endocrine tissue-derived hormones.
o Storage stock: information about the amount of resources
available to the system. Some storage stocks
may reflect the accumulated past deviations of System
stock from the set point. Reported by hormones produced
by tissues that serve as depots for regulated variables.
o Flows: information about the anticipated change in
the System stock. Reported by hormones produced
by tissues that operate flows with large impact on System
stock.

• Homeostatic signals affect two types of variables: Plant flows
and Controller’s gains. In addition, the sensitivity of Control- 
lers and Plants to homeostatic signals can also be regulated. 

• Signals that report on Storage stock tune the integral gain 
of Controllers, whereas signals that report on flows tune 
the differential gain of Controllers. 

• Inflammatory signals target the same control points as the 
homeostatic signals: these are Plant flows and Controller’s 
gains. In addition to directly affecting these parameters, in- 
flammatory signals can modulate the sensitivity of Control- 
lers and Plants to homeostatic signals. 

• Inflammatory response aims to restore homeostasis, but to 
achieve this goal it has to suppress incompatible lower
priority homeostatic processes. Therefore, inflammatory 
signals are antagonistic to the incompatible homeostatic 
signals. 

• Inflammatory signals are dominant over homeostatic sig- 
nals because they have higher priority. Physiological prior- 
ities determine the hierarchy of signals. 

• The parallels between homeostatic and inflammatory sig- 
nals suggest the evolutionary origin of inflammation as a 
control system that complements the homeostatic control 
when the latter is insufficient. 

• Inflammation can change homeostatic settings of a system 
by changing Controller’s gains and by overriding homeo- 
static signals. Inflammation commonly accompanies ho- 
meostatic diseases associated with set point changes. 
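
The Controller terminology summarized above (set point, error, gain, and PID gains) can be made concrete with a minimal sketch. Everything in it is an illustrative assumption rather than part of the framework itself: the class and function names, the numerical gains, and the constant outflow `drain` are hypothetical. The text defines the error as |X-X’|; the sketch uses the signed difference so that the signal carries the direction of the correction.

```python
from dataclasses import dataclass


@dataclass
class PIDController:
    """Toy Controller with proportional, integral, and differential gains."""
    set_point: float      # X' in the text
    kp: float = 0.0       # proportional gain: weights the present error
    ki: float = 0.0       # integral gain: weights the accumulated past error
    kd: float = 0.0       # differential gain: weights the rate of change of error
    _integral: float = 0.0
    _prev_error: float = 0.0

    def signal(self, x: float, dt: float = 1.0) -> float:
        """Return the signal S for the current value X of the regulated variable."""
        error = self.set_point - x  # signed error (assumption; the text uses |X - X'|)
        self._integral += error * dt
        derivative = (error - self._prev_error) / dt
        self._prev_error = error
        return self.kp * error + self.ki * self._integral + self.kd * derivative


def run(controller: PIDController, x0: float, steps: int, drain: float = 1.0) -> float:
    """Toy Plant: converts S into an inflow to the stock; `drain` is a constant outflow."""
    x = x0
    for _ in range(steps):
        x += controller.signal(x) - drain
    return x


p_only = run(PIDController(set_point=100.0, kp=0.5), x0=80.0, steps=200)
p_and_i = run(PIDController(set_point=100.0, kp=0.5, ki=0.1), x0=80.0, steps=200)
print(round(p_only, 2), round(p_and_i, 2))
```

With a constant drain, the proportional-only Controller settles slightly below the set point, whereas adding integral gain removes that residual error. Changing `kp`, `ki`, or `kd` without touching the Plant corresponds to what the text calls gain tuning.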
SUMMARY

Hunger is controlled by specialized neural circuits 
that translate homeostatic needs into motivated be- 
haviors. These circuits are under chronic control by 
circulating signals of nutritional state, but their rapid 
dynamics on the timescale of behavior remain un- 
known. Here, we report optical recording of the nat- 
ural activity of two key cell types that control food 
intake, AgRP and POMC neurons, in awake behaving 
mice. We find unexpectedly that the sensory detec- 
tion of food is sufficient to rapidly reverse the activa- 
tion state of these neurons induced by energy deficit. 
This rapid regulation is cell-type specific, modulated 
by food palatability and nutritional state, and occurs 
before any food is consumed. These data reveal that 
AgRP and POMC neurons receive real-time informa- 
tion about the availability of food in the external 
world, suggesting a primary role for these neurons 
in controlling appetitive behaviors such as foraging 
that promote the discovery of food. 

INTRODUCTION 

Food intake is controlled by evolutionarily hard-wired neural circuits
that contain specialized neural cell types. Two cell types in
the arcuate nucleus (ARC) of the hypothalamus are known to be 
particularly important for the control of feeding. These neurons 
are identified by expression of the neuropeptides Agouti-related 
Protein (AgRP) and Proopiomelanocortin (POMC) and have 
opposing functions. AgRP neurons are activated by energy 
deficit (Hahn et al., 1998) and promote food seeking and consumption.
Optogenetic or chemogenetic activation of AgRP neurons
induces voracious eating in sated mice (Aponte et al., 2011;
Krashes et al., 2011), whereas inhibition or ablation of AgRP neurons
results in aphagia (Gropp et al., 2005; Krashes et al., 2011;
Luquet et al., 2005). These effects of AgRP neurons are mediated 
by release of GABA as well as two neuropeptides, AgRP and 
NPY, that stimulate food intake when delivered into the brain 
(Clark et al., 1985; Fan et al., 1997; Ollmann et al., 1997; Tong 
et al., 2008). POMC neurons by contrast are activated by energy 
surfeit and their activity inhibits food intake and promotes weight 
loss. These two cell types interact in part through a common set 
of downstream neural targets that express melanocortin receptors,
which are activated by POMC and inhibited by AgRP (Fan
et al., 1997; Ollmann et al., 1997; Seeley et al., 1997). Thus,
AgRP and POMC neurons are two intermingled, interacting neu- 
ral cell types that have opposing roles in the control of feeding. 

Despite intense investigation of these cells over the past 20 
years, their activity dynamics during behavior remain unknown. 
This knowledge gap reflects the difficulty of recording cell- 
type-specific neural activity within heterogeneous deep brain 
structures such as the hypothalamus. As a result, our current 
understanding of the regulation of AgRP and POMC neurons is 
based on a combination of approaches that include in vitro elec- 
trophysiology, c-fos staining, pharmacology, and genetic manip- 
ulations. These pioneering studies have revealed a dominant 
role for circulating hormones and nutrients in the control of these 
cells (Williams and Elmquist, 2012). AgRP and POMC neurons are
modulated by hormones such as ghrelin and leptin (Cowley et al.,
2001, 2003; Nakazato et al., 2001; Pinto et al., 2004) as well 
as circulating nutrients (Blouet and Schwartz, 2010) in part 
via their metabolic effects on mitochondrial dynamics (Dietrich 
et al., 2013; Schneeberger et al., 2013). Together, these findings 
have led to a generally accepted model in which AgRP and POMC 
neurons function as interoceptors that monitor the concentration 
of hormones and nutrients in the blood and then gradually 
adjust their activity in parallel with changes in nutritional state. 
This model provides a compelling explanation for how nutritional 
changes can be translated into counterregulatory responses but 
leaves unanswered the question of whether these neurons are 
also subject to rapid regulation on the timescale of behavior. 

AgRP and POMC neurons also receive abundant synaptic 
input which provides the potential for more rapid modulation. 
However, the function of this afferent input is not well under- 
stood. Fasting increases excitatory tone onto AgRP neurons 
(Liu et al., 2012; Yang et al., 2011), and one source of such excitatory
input is neurons in the paraventricular hypothalamus (PVH)
(Krashes et al., 2014). AgRP neurons also receive inhibitory input
from the dorsomedial hypothalamus (DMH) among other sour- 
ces (Krashes et al., 2014). POMC neurons by contrast receive 
inhibitory input from cells in the ARC, including neighboring 
AgRP neurons, as well as excitatory input from the ventromedial 
hypothalamus (VMH) and other regions (Cowley et al., 2001; 
Krashes et al., 2014; Pinto et al., 2004; Sternson et al., 2005; 
Vong et al., 2011). As these circuit connections have only 
recently been described, their regulation and function are not 
yet clear. An important open question regards the nature of the 
information that these presynaptic cells communicate to their 
AgRP and POMC targets. 

In the present study, we have used an optical approach to 
record the natural activity of AgRP and POMC neurons in awake 
Figure 1. Optical Recording of AgRP and POMC Neuron Activity in Awake Behaving Mice

(A) FLEX AAV used to drive GCaMP6s expression. 

(B) Response of AgRP and POMC neurons to current ramp. Scale bar represents GCaMP6s fluorescence normalized to 1.0 at start of the experiment (Fn).

(C) Membrane potential and GCaMP6s fluorescence in response to sequential 10 pA current steps of duration 2 s separated by 20 s. 

(D) Relationship between action potential number and fluorescence for cells in (C). 

(E) R² and p values for the linear regression of fluorescence versus action potential number for 16 POMC and 14 AgRP neurons.

(F) Schematic of the fiber photometry setup. 

(G) Coronal section from AgRP and POMC mice showing path of optical fiber and injection site. Scale bar represents 1 mm. 

(H) Fluorescence trace during cage exploration for mice expressing GCaMP6s or GFP in AgRP neurons or POMC neurons. 

All error bars represent ± SEM. 

See also Figure S1.



behaving mice. These experiments have unexpectedly revealed 
that AgRP and POMC neurons are strongly regulated in vivo by 
the sensory detection of food. This rapid sensory regulation re- 
sets the activation state of these cells induced by food depriva- 
tion prior to the start of food consumption. This rapid regulation 
also contains information about the food’s hedonic properties 
and depends on the animal’s nutritional state. These findings 
reveal that AgRP and POMC neurons receive real-time information
about the availability of food in the outside world, which they
then use to anticipate the nutritional value of a forthcoming meal 
and adjust their activity in advance. This anticipatory regulation 
provides a mechanism to rapidly inhibit foraging upon food 
discovery, suggesting a primary role for these neurons in the 
regulation of appetitive behaviors in vivo. 



RESULTS 

Optical Recording of AgRP and POMC Neuron Activity in 
Awake Behaving Mice 

In order to gain deeper insight into the regulation of AgRP and 
POMC neurons, we sought to record their natural activity during 
feeding behavior. To do this, we used fiber photometry (Cui 
et al., 2013; Gunaydin et al., 2014), an approach that employs 
a multimode optical fiber to record the total fluorescence 
from a population of neurons expressing a calcium reporter 
for neural activity (Figure 1F). By targeting the calcium reporter 
to a specific cell type, this method enables optical recording 
of the real-time activity of a molecularly defined population
of neurons within a deep brain structure. The resulting trace 
represents the integrated activity of the neurons defined
by a genetic marker and anatomic location and therefore is
particularly well-suited for use in the hypothalamus, which contains
genetically separable populations of neurons with distinct
functions.

We first confirmed that calcium signals from AgRP and
POMC neurons correlate with changes in firing rate ex vivo. 
We targeted the sixth generation calcium reporter GCaMP6s 
(Chen et al., 2013) to AgRP and POMC neurons by stereotaxic 
injection of Cre-dependent AAVs into AgRP-IRES-Cre and 
POMC-Cre mice (Figure 1A). We then prepared acute brain 
slices for imaging, and fluorescent cells in the ARC were iden- 
tified for whole-cell current clamp recordings. Activation by 
depolarizing current ramp (0-40 pA, 10 s) induced bursts 
in firing accompanied by increased GCaMP6s fluorescence
(Figure 1B). To quantify the relationship between firing rate
and fluorescence signal, we applied step currents (-20 pA 
to +120 pA, 10 pA increments), which resulted in progressive 
increases in spikes and fluorescence (Figure 1C). Quantification 
of this response revealed a linear correlation between action 
potential number and GCaMP6s signal (Figures 1D and 1E).
Thus, GCaMP6s can report on activity dynamics in AgRP and
POMC neurons as shown for other cell types (Chen et al., 
2013). 
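
The per-cell analysis described above, a linear regression of fluorescence against action potential number, can be sketched as follows. This is a minimal ordinary least-squares illustration and not the authors' analysis code. The function name and the data values are hypothetical and chosen only to show the calculation.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical values for one cell: spikes per current step and normalized fluorescence.
spikes = [0, 2, 5, 9, 14, 18]
fluorescence = [1.00, 1.06, 1.14, 1.28, 1.41, 1.55]
slope, intercept, r2 = linear_fit(spikes, fluorescence)
print(f"slope={slope:.4f} per spike, R^2={r2:.3f}")
```

An R² close to 1 for a given cell indicates that fluorescence scales approximately linearly with spike count over the tested range of current steps.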

To apply this approach in vivo, we injected AAVs expressing 
GCaMP6s into the ARC of the corresponding Cre mice and in
the same surgery installed an optical fiber unilaterally above
the ARC (Figure S1). After allowing 2 weeks for transgene
expression, we connected mice to a photometry rig and recorded
fluorescence from these cells as mice explored a feeding
chamber without access to food. Baseline recordings from AgRP
and POMC neurons showed dynamic fluctuations (~10%-20%
ΔF/F) that resembled bursts of synchronous activity observed
in other cell types (Cui et al., 2013; Gunaydin et al., 2014)
(Figure 1H). These dynamics were unrelated to mouse movement,
unaffected by changes in ambient lighting, and absent from recordings
from control mice expressing GFP in AgRP or POMC
neurons (Figure 1H), indicating that they represent calcium-dependent
GCaMP6s signals.
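
Photometry signals here are reported as ΔF/F, the fractional change in fluorescence relative to a baseline. The sketch below shows one common way to compute it. The choice of baseline (the mean of the first samples) and the function name are assumptions made for illustration; the normalization actually used in the study is not specified in this section.

```python
def delta_f_over_f(trace, baseline_samples):
    """Return dF/F as a fraction, using the mean of the first
    `baseline_samples` points as the baseline F0 (an assumed convention)."""
    f0 = sum(trace[:baseline_samples]) / baseline_samples
    return [(f - f0) / f0 for f in trace]


# Hypothetical raw fluorescence values (arbitrary units).
raw = [100.0, 100.0, 110.0, 120.0]
print([round(v, 3) for v in delta_f_over_f(raw, baseline_samples=2)])
```

Multiplying the result by 100 gives the percentage values quoted in the text.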

To test the sensitivity of this assay to detect changes in neu- 
ral activity, we challenged mice with ghrelin, a hormone that ac- 
tivates AgRP neurons and inhibits POMC neurons (Cowley 
et al., 2003; Nakazato et al., 2001). Mice expressing GCaMP6s
in either AgRP or POMC neurons were acclimated to a behavioral
chamber, given an intraperitoneal injection of ghrelin, and
then returned to the chamber. Ghrelin sharply increased calcium
signals from AgRP neurons (ΔF/F = 71% ± 10% at
5 min, p < 0.001 compared to vehicle) (Figures 2A and 2B;
Movie S1). This increase began within seconds of injection
(mean latency = 33 ± 7 s) and reached a plateau within 2 min
(τ = 76 ± 12 s, where τ is the exponential time constant corresponding
to the time after injection resulting in ~63.2% of the
total change). In the absence of further intervention, this increase
in AgRP activity was sustained for the duration of the
recording (ΔF/F = 62% ± 10% at 15 min) (Figure 2B). By
contrast, injection of vehicle (PBS) had no effect on the activity
of AgRP neurons (ΔF/F = -3% ± 2% at 5 min) (Figure 2B;
Movie S1).
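
The time constant τ can be made concrete with a short sketch. For a single-exponential approach to a plateau, the fraction of the total change reached at t = τ is 1 - 1/e, about 63.2%. The code below builds a synthetic trace from the reported AgRP values and recovers τ from it. Treating 71% as the plateau, ignoring the response latency, and the function names are all simplifying assumptions of this illustration.

```python
import math


def exponential_response(t: float, total_change: float, tau: float) -> float:
    """Single-exponential approach to a plateau: total * (1 - exp(-t / tau))."""
    return total_change * (1.0 - math.exp(-t / tau))


def estimate_tau(times, values, total_change):
    """Return the first sampled time at which the response reaches
    1 - 1/e of the total change, or None if it never does."""
    threshold = (1.0 - math.exp(-1.0)) * abs(total_change)
    for t, v in zip(times, values):
        if abs(v) >= threshold:
            return t
    return None


# Synthetic trace: tau = 76 s, assumed plateau = 71% dF/F, sampled once per second.
times = list(range(0, 600))
trace = [exponential_response(t, 71.0, 76.0) for t in times]
fraction_at_tau = exponential_response(76.0, 71.0, 76.0) / 71.0
print(f"fraction of total change at t = tau: {fraction_at_tau:.3f}")
print("estimated tau (s):", estimate_tau(times, trace, 71.0))
```

On real recordings a least-squares exponential fit would be more robust to noise than this threshold-crossing estimate.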



POMC neurons showed the opposite response, with ghrelin 
injection rapidly and potently inhibiting POMC activity (τ =
160 ± 17 s; ΔF/F = -49% ± 4% at 15 min, p = 0.001 compared
to vehicle) (Figures 2C and 2D; Movie S2). Interestingly, vehicle
injection alone produced a small but reversible drop in POMC
activity (Figure 2D; Movie S2). This transient decline in POMC
activity was consistently observed following animal handling,
suggesting that POMC but not AgRP neurons receive an inhibitory
stress-regulated input.

We next tested the effect of food on the response to ghrelin. 
Our prediction based on the known nutritional regulation of these 
cells was that food consumption would gradually inhibit AgRP 
neurons and activate POMC neurons as animals transitioned 
from hunger to satiety. To test this, we challenged mice with 
ghrelin and then 20 min later presented them with a pellet of 
chow. Unexpectedly, we found that food presentation alone 
rapidly reversed much of the effect of ghrelin treatment (ΔF/F =
-29% ± 3% at 2 min for AgRP neurons and ΔF/F = 80% ±
3% at 2 min for POMC neurons) (Figure 2). This response began
immediately upon placing food in the cage and was complete
within seconds (τ = 12 ± 2 s for AgRP neurons; τ = 44 ± 3 s for
POMC neurons). All animals tested showed this response to 
food presentation (traces for ten mice are shown in Figure 2E), 
suggesting that it represents a general mechanism that regulates 
the activity of these neurons in vivo. 

Food Detection Reverses the Effects of Fasting on AgRP 
and POMC Activity 

The regulation of AgRP and POMC neurons by sensory detection 
of food has not previously been described. To investigate this 
phenomenon under more physiologic conditions, we fasted 
mice overnight and then presented a pellet of chow. As observed 
for ghrelin-treated animals, food presentation to fasted mice 
strongly inhibited AgRP neurons (ΔF/F = -37% ± 4% at
5 min, p < 0.001 compared to object) and activated POMC neurons
(ΔF/F = 38% ± 5% at 5 min, p < 0.001 compared to object)
(Figure 3; Movies S3 and S4). These responses began the
moment that food was presented and were rapidly complete
(τ = 20 ± 4 s for AgRP neurons and τ = 42 ± 18 s for POMC neurons).
To quantify the extent to which these changes required
food consumption, we analyzed video data to estimate the
moment at which the first bite of food was consumed in each trial
and then aligned calcium traces to this event. This revealed that
most of the activity changes in these neurons were already complete
by the time food intake was initiated (96% ± 6% complete
before feeding in AgRP neurons, 85% ± 5% in POMC neurons)
(Figures 3H and 3I). Thus, the response of AgRP and POMC neurons
to food is triggered primarily by food detection rather than food
consumption. Of note, these stereotyped responses to food presentation
were consistently observed in the first trial of each
mouse (Figure 3G), indicating that this effect does not require 
prior training. 
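
The "percent complete before feeding" measure described above can be sketched as follows. This is an illustrative reconstruction, not the analysis code behind Figure 3I: the function name `fraction_before_first_bite` is hypothetical, and it assumes that the total change is measured from the sample at food presentation to the last sample of the trial, and that the value at the first bite is the most recent sample at or before that time.

```python
from bisect import bisect_right


def fraction_before_first_bite(times, signal, presentation_time, first_bite_time):
    """Fraction of the total fluorescence change that precedes the first bite.

    times: sorted sample times in seconds; signal: fluorescence at those times.
    """
    def value_at(t):
        # Most recent sample at or before time t.
        idx = bisect_right(times, t) - 1
        if idx < 0:
            raise ValueError("time precedes the start of the recording")
        return signal[idx]

    baseline = value_at(presentation_time)
    total_change = signal[-1] - baseline
    if total_change == 0:
        raise ValueError("no net change over the trial")
    return (value_at(first_bite_time) - baseline) / total_change


# Illustrative trace only: food presented at t = 0 s, first bite at t = 2 s.
demo_fraction = fraction_before_first_bite(
    [0, 1, 2, 3, 4], [1.0, 0.8, 0.65, 0.62, 0.6], 0, 2
)
```

A value near 1 indicates that almost all of the change had occurred before feeding began, as reported here for AgRP neurons.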

We investigated the determinants of this rapid response to 
food discovery. Presentation of an inedible object (a rubber stopper
similar in size to a piece of chow) had little effect on the activity
of AgRP neurons (ΔF/F = 4.9% ± 2.2%) and induced a small
change in POMC neurons in the opposite direction of food (ΔF/F =
-10% ± 2%). Thus, the response of these neurons to food
Figure 2. Ghrelin Rapidly Modulates AgRP and POMC Neurons

(A and C) Recordings from a mouse expressing GCaMP6s in AgRP or POMC neurons that was challenged with injection of ghrelin (light gray) followed by
presentation of a pellet of chow (dark gray).

(B and D) Calcium signals from AgRP and POMC neurons aligned to the time of PBS or ghrelin injection, or chow presentation to ghrelin-treated mice. Red and
gray indicate the mean response and SE (AgRP, n = 7; POMC, n = 5). In each trial, fluorescence was normalized by assigning a value of 1.0 to the median value of
data points within a 2-min window at −5 min before treatment.

(E) Peri-event plots showing the response from a single trial of five AgRP mice and five POMC mice. 

All error bars represent ± SEM. 

See also Movies S1 and S2. 



presentation is food-specific (Figure 3; Movies S3 and S4). The 
sensitivity of these cells to food presentation also depended on 
nutritional state, as AgRP neurons from ad-libitum-fed mice 
showed no response to food presentation (ΔF/F = -4.7% ±
1.0%, p = 0.21 compared to object) whereas POMC neurons
from ad-libitum-fed mice showed a greatly diminished response
(ΔF/F = 4.7% ± 2.4%, p = 0.01 compared to object) (Figures 3E
and 3F). Thus, conditions that reflect energy deficit, such as 
fasting or ghrelin treatment, potentiate the response of AgRP 
and POMC neurons to food detection. 

Food Quality Influences the Magnitude of the Response 

We considered the possibility that the response of AgRP and 
POMC neurons to food presentation depends on the food’s 
hedonic properties. In this regard, sensory cues associated 
with palatable or energy dense foods trigger activation of brain 
regions involved in reward, but how this hedonic information is 
integrated with homeostatic signals remains poorly understood. 
To investigate this, we first measured the response to peanut 
butter, an energy dense food that mice will eat in preference to 



chow and is considered rewarding. Mice were fasted overnight, 
acclimated to a behavioral chamber, and then presented with 
either a pellet of chow or a dollop of peanut butter. Presentation
of peanut butter strongly inhibited AgRP neurons (ΔF/F =
-54% ± 6% at 5 min) (Figure 4A) and activated POMC neurons
(ΔF/F = 101% ± 31% at 5 min) (Figure 4C). These responses
began immediately upon food presentation (Movies S5 and S6)
and were complete in <1 min (τ = 23 ± 6 s for AgRP and τ =
29 ± 6 s for POMC). The responses to peanut butter were significantly
larger than the responses to chow (Figure 4E) and indeed
were comparable in magnitude (but opposite in sign) to the effect 
of injection with pharmacologic doses of ghrelin (Figure 4F), 
which to our knowledge is the strongest known stimulus that 
modulates these cells. 

A defining feature of palatable foods is that animals will 
consume them in the absence of hunger because they are intrin- 
sically rewarding (e.g., eating dessert after a meal). We therefore 
tested whether AgRP and POMC neurons from ad-libitum-fed 
mice, which show little or no response to chow (Figure 3), would 
nonetheless respond to the presentation of peanut butter. Indeed, 
Figure 3. Sensory Detection of Food Rapidly Regulates AgRP and POMC Neurons

(A and D) Recordings from fasted mice expressing GCaMP6s in AgRP or POMC neurons presented with a pellet of chow (gray).

(B and E) Plots of calcium signals from AgRP and POMC neurons aligned to the time of presentation of a pellet of chow (red) or inedible object (black). Mice were
either subjected to an overnight fast (left) or fed ad libitum (right) prior to experiment. Gray indicates SE (AgRP, n = 10; POMC, n = 5).

(C and F) Quantification of fluorescence changes 5 min after event, as indicated.

(G) Peri-event plots aligned to the time of event. Each row is a single trial of a different mouse.

(H) Calcium signals aligned to the initiation of feeding for AgRP and POMC neurons.

(I) Quantification of change in fluorescence that occurs before feeding is initiated versus the total change in the trial. *p < 0.05, **p < 0.01, ***p < 0.001, ****p <
0.0001.

All error bars represent ± SEM. 

See also Movies S3 and S4. 



we found that presentation of peanut butter to ad-libitum-fed mice
strongly inhibited AgRP neurons (ΔF/F = -24% ± 4% at 5 min,
p < 0.001 compared to chow) and activated POMC neurons
(ΔF/F = 55% ± 11% at 5 min, p = 0.14 compared to chow) (Figures
4A and 4C). Thus, more palatable food can modulate these
neurons even in the absence of signals of energy deficit.

To further probe this relationship, we tested whether the
response of these neurons to different foods depended on the
order in which they were presented. Mice were fasted overnight
and then sequentially presented with an inedible object, peanut
butter, or chow in randomized order at 10-min intervals. We then
calculated the change in activity that occurred following each of 
these presentations. This revealed that presentation of peanut 
butter could completely block the subsequent neural response 
to presentation of chow (Figures 4B and 4D). By contrast, pre- 
sentation of chow had no effect on the response to peanut butter 
in POMC neurons (Figure 4D) and only partially diminished the 
response in AgRP neurons (Figure 4B). The asymmetry in the 
response to these two foods is consistent with their differential 
effects in fasted and fed mice. 
Figure 4. Food Palatability Determines the Magnitude of the Response to Food Detection

(A and C) Calcium signals from AgRP and POMC neurons in fasted and fed mice aligned to the time of presentation of peanut butter or chow.

(B and D) Fluorescence change of AgRP and POMC neurons upon sequential presentation of an inedible object, chow, and peanut butter in fasted mice.

(E) Quantification of responses of AgRP and POMC neurons 5 min after food presentation.

(F) Plot showing the response of AgRP and POMC neurons over 5 min to different foods and pharmacologic treatments in the context of varying nutritional states. 
All traces start at the origin (0,0) and emanate outward. Arrows indicate the direction of movement. 

All error bars represent ± SEM. 

See also Figure S2 and Movies S5 and S6. 



To extend these findings, we tested chocolate, a second food
that is commonly used in rodent studies of reward. We found that 
presentation of chocolate (Hershey Kiss) to fasted mice inhibited 
AgRP neurons to a greater extent than chow (Figure S2A). Like 
peanut butter, chocolate also elicited a response in AgRP neu- 
rons from ad-libitum-fed mice that are unresponsive to chow 
(Figure S2B). Sequential presentation experiments revealed 
that chocolate could block the neural response to subsequent 
presentation of chow, but not vice versa, similar to our observa- 
tions with peanut butter (Figures S2D and S2E). Although choc- 
olate was a novel food for these animals, we observed responses 
to chocolate presentation in the first trial, indicating mice could 
identify it as food without prior experience. However, the speed
of the response to chocolate increased during subsequent tests,
suggesting involvement of a learning process as well (τ = 40 ± 8 s



in trial 1 versus 17 ± 2 s in trial 4, p < 0.01) (Figure S2C). Collec- 
tively, these data show that the rapid sensory regulation of AgRP 
and POMC neurons contains information about the hedonic 
properties of the food that has been detected. 

Food Accessibility Modulates the Response to Food 
Discovery 

Most of the response of AgRP and POMC neurons to food 
presentation occurred before food intake was initiated (Figures 
3H and 3I). We therefore wondered whether food consumption
played any role in this response. To test this, mice were fasted 
overnight and then presented with peanut butter in a container 
that allowed the food to be seen and smelled but not consumed 
(Figure 5A). Presentation of this inaccessible peanut butter 
rapidly activated POMC neurons (ΔF/F = 43% ± 9% after
2 min; τ = 31 ± 8 s) and inhibited AgRP neurons (ΔF/F = -39% ±
4% after 2 min; τ = 21 ± 4 s) (Figures 5B and 5C). Similar responses
were observed in mice pretreated with ghrelin (Figures
S3A and S3B). These responses occurred as quickly as the
response to accessible food, but were somewhat smaller in
magnitude (Figures S3C and S3D), and the response of POMC
neurons was less durable (Figures 5B and 5C). This indicates
that food accessibility can modulate the strength of the response
to food presentation.

To further dissect this effect, we tested whether an isolated
sensory cue could modulate the activity of these two cell types.
As mice rely heavily on the sense of smell, we tested whether the
smell of peanut butter could elicit an activity change in AgRP and
POMC neurons. Mice were fasted overnight and then exposed to
peanut butter placed underneath the cage in a covered container
so that it could be smelled but not seen or accessed (Figure 5D).
We found that this “hidden peanut butter” rapidly modulated
AgRP and POMC neurons in a way that resembled food presentation
(ΔF/F = -12% ± 5% after 1 min in AgRP neurons and ΔF/F =
17% ± 6% after 1 min in POMC neurons) (Figures 5E and 5F).
However, this effect was much smaller in magnitude and transient,
with neural activity returning to baseline within 8 min (ΔF/F =
8.3% ± 4.5% after 8 min in AgRP neurons and ΔF/F =
-3.0% ± 4.0% after 8 min in POMC neurons) (Figures 5F and
S3). Together, these data suggest that food-associated sensory
cues can modulate these two cell types, but that the magnitude
and durability of this response depend on the extent to which
these cues are interpreted as representing access to food.

Food Removal Reverses the Effects of Food 
Presentation 

The response of AgRP and POMC neurons to food presentation
is consistent with a model in which these neurons anticipate the
change in their activity that will occur after food consumption and
then enact this change in advance, taking into account factors
such as the food’s energy density, the food’s accessibility, and
the animal’s nutritional state. A prediction of this model is that
the response to food presentation should be reversed if the
food is removed before it can be consumed. To test this, mice
were fasted overnight, presented with accessible chow, and
then the food was removed after either a 2-, 10-, or 30-min interval.
As predicted, we found that food removal reversed the effects
of food presentation, resulting in activation of AgRP neurons
and inhibition of POMC neurons (Figures 5G and 5J; for clarity,
only data after 2 and 10 min removal are shown). The magnitude
and kinetics of this reversal depended on the duration that mice
were given food access. For example, mice given access to food
for 30 min showed a smaller reversal of AgRP and POMC neuron
activity following food removal than mice given access to food
for 2 or 10 min (Figures 5H and 5K). Extended food access
also slowed the response to food removal in AgRP but not
POMC neurons (Figures 5I and 5L). These findings are consistent
with food consumption during the feeding interval partially resetting
the activation state of these neurons.

The response to food removal exhibited hysteresis, occurring
~10-fold more slowly than the initial response to food presentation
(Figures 5I and 5L). This asymmetry was observed after only
2 min food access in both AgRP (τ = 15 ± 1 s versus 258 ± 26 s,
p < 0.0001) and POMC neurons (τ = 19 ± 3 s versus 269 ± 66 s,
p = 0.03) and therefore was unlikely to be caused by the post-ingestive
effects of food consumption. Rather, this suggests
that the circuit interprets the sensory detection of food in such
a way that food removal induces a more gradual change than
food discovery.

Neural Dynamics within Feeding Bouts 

We have focused on the initial response of AgRP and POMC
neurons to food presentation, because this response is much
larger than the fluctuations in the activity of these neurons that
occur during feeding (Figures 3A and 3D). However, we considered
the possibility that these smaller intrameal dynamics might
also be correlated with components of behavior. To test this, we
switched to a system in which mice were fed a liquid diet (vanilla
Ensure) via a lickometer so that we could align individual feeding
bouts to photometry data with millisecond precision.

Mice were transitioned from a solid to liquid diet over several
days, then fasted overnight and tested in a 1-hr trial. Licks
were aligned to photometry traces and individual feeding bouts
defined as clusters of licks separated from their nearest neighbor
by >20 s. This resulted in identification of an average of 17 ± 2
feeding bouts in each 1-hr trial, with each bout lasting an average
of 17 ± 3 s and containing 53 ± 10 licks. The start of each bout
in a representative trial is indicated by gray lines in Figures 6A
and 6B. 
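
The bout definition above (clusters of licks separated by more than 20 s) can be sketched in a few lines of Python. This is a minimal illustration rather than the code used for Figure 6; the function names `group_licks_into_bouts` and `bout_summary` are hypothetical, and lick times are assumed to be timestamps in seconds.

```python
def group_licks_into_bouts(lick_times, min_gap=20.0):
    """Group lick timestamps (s) into bouts.

    A new bout starts whenever the gap since the previous lick exceeds
    `min_gap` seconds; otherwise the lick joins the current bout.
    """
    bouts = []
    for t in sorted(lick_times):
        if bouts and t - bouts[-1][-1] <= min_gap:
            bouts[-1].append(t)
        else:
            bouts.append([t])
    return bouts


def bout_summary(bouts):
    """Return (start time, duration, lick count) for each bout."""
    return [(b[0], b[-1] - b[0], len(b)) for b in bouts]


# Illustrative timestamps only.
demo_bouts = group_licks_into_bouts([0.0, 0.2, 0.5, 30.0, 30.1, 100.0])
demo_summary = bout_summary(demo_bouts)
```

Keeping `min_gap` as a parameter makes it straightforward to check whether results are sensitive to the minimum interval chosen to separate bouts.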

We compared the average activity of these neurons during
active feeding (intrabout) versus intermeal intervals (interbout),
by calculating the difference in fluorescence between these
stages (interbout - intrabout). This revealed that POMC neurons
were more active during feeding whereas AgRP neurons were
less active (ΔFn = 0.042 ± 0.011 for AgRP versus ΔFn =
-0.029 ± 0.004 for POMC, p = 0.001) (Figure 6C). To investigate
the dynamics underlying these differences, we aligned each
feeding bout so that the start of the bout (first lick) corresponded
to time zero and then analyzed a 10-s window flanking this
moment. We found that AgRP and POMC neurons showed a
consistent pattern of activity that predicted the onset of each
meal. AgRP neurons declined in activity until the moment of
the first lick and then their activity flattened (Figure 6D), whereas
POMC neurons increased in activity prior to and throughout the
start of feeding (Figure 6E). Cross-correlation analysis between
AgRP and POMC showed that there was an inverse correlation
between the activity of these two cell types that reached a
peak at approximately time zero (Figure 6F). These effects
were tightly linked to behavioral state, as they were robust to
changes in the definition of a feeding bout (e.g., changes in the
minimum intermeal interval) yet were completely absent when
the data were re-analyzed using randomly generated feeding
bouts (Figures 6D and 6E, black). Remarkably, these intrameal
anticipatory changes in AgRP and POMC activity appear to recapitulate,
on a smaller scale, the dramatic changes in activity that
occur in these neurons in response to food presentation.
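
The peri-event alignment and the randomized-bout control described above can be illustrated with a short Python sketch. It is a sketch under stated assumptions and not the analysis code used here: the function names are hypothetical, the window is expressed in samples rather than seconds, events too close to either end of the recording are simply skipped, and the control draws event positions uniformly at random.

```python
import random


def peri_event_average(signal, event_indices, half_window):
    """Average the signal in a window of `half_window` samples on either
    side of each event index. Events too close to the edges are skipped."""
    segments = [
        signal[i - half_window : i + half_window + 1]
        for i in event_indices
        if i - half_window >= 0 and i + half_window < len(signal)
    ]
    if not segments:
        raise ValueError("no event has a complete window")
    n = len(segments)
    # Average across events at each position within the window.
    return [sum(column) / n for column in zip(*segments)]


def random_event_indices(n_events, signal_length, half_window, seed=0):
    """Draw event indices uniformly at random, keeping full windows in range.
    Used as a control for alignment to real bout onsets."""
    rng = random.Random(seed)
    return [
        rng.randrange(half_window, signal_length - half_window)
        for _ in range(n_events)
    ]


# Illustrative data only: a ramp signal with two "bout onsets".
demo_average = peri_event_average(list(range(10)), [2, 5], half_window=1)
demo_control = random_event_indices(20, 100, half_window=5, seed=1)
```

Comparing the average aligned to real bout onsets with the average aligned to randomly placed events indicates whether a peri-event pattern is tied to behavior or would appear at arbitrary times.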

Dynamics of AgRP Projections to the PVH 

AgRP neurons project broadly to brain regions involved in the
control of food intake in a primarily one-to-one configuration
(Betley et al., 2013). Optogenetic experiments have identified
Figure 5. The Response to Food Detection Depends on Food Accessibility and Is Reversible

(A) Schematic of caged peanut butter. 

(B) Calcium signals aligned to the time of presentation of a caged peanut butter. 

(C) Change in fluorescence 1 and 8 min after caged peanut butter presentation. 

(D) Schematic of hidden peanut butter. 

(E) Calcium signals aligned to the time of presentation of hidden peanut butter. 

(F) Change in fluorescence 1 and 8 min after hidden peanut butter presentation. 

(G and J) Chow was presented at time 0, and then food was removed at 2 min (red), 10 min (blue) or not removed (black). 

(legend continued on next page) 
Figure 6. Intrameal Dynamics of AgRP and POMC Neurons

(A and B) Traces of AgRP and POMC activity in mice during consumption of a liquid diet. Licks that mark initiation of a feeding bout are shown in gray.

(C) Difference in average fluorescence between periods of feeding (intrabout) and intermeal intervals (interbout) for each mouse.

(D and E) Calcium signals from AgRP and POMC neurons aligned to the moment of the first lick that initiates a feeding bout. Data from actual feeding bouts shown
in red; data from simulated randomly generated feeding bouts in black.

(F) Cross-correlation plots showing the correlation between activity of AgRP and POMC neurons before and after licking. Red is mean, gray is 28 individual
comparisons between AgRP (n = 7) and POMC (n = 4) mice.

(G and H) Peri-event plots showing the activity of AgRP and POMC neurons aligned to the start of feeding bouts. The top plot shows all of the bouts for one trial of a
mouse. The bottom plot shows the average response across all bouts for seven AgRP and four POMC mice.

All error bars represent ± SEM. 



AgRP projections to the PVH as being particularly important for
the control of feeding (Atasoy et al., 2012). As fiber photometry
enables direct monitoring of axonal calcium transients (Gunaydin
et al., 2014), we sought to record the activity of these key AgRP
(ARC → PVH) projections during behavior.

AAVs expressing Cre-dependent GCaMP6s were delivered to
the ARC of AgRP-IRES-Cre mice and in the same surgery an



optical fiber was implanted ipsilaterally in the PVH (Figure 7A).
Photometry recordings 4 weeks after surgery revealed spontaneous
synchronous activity in these projections (Figure 7B)
that resembled calcium dynamics observed in AgRP cell bodies
(Figure 1H), albeit somewhat smaller in magnitude. Intraperitoneal
injection of ghrelin, but not vehicle, induced a rapid increase
in calcium signals in these projections (ΔF/F = 17% ± 5% for



(H and K) Recovery in fluorescence 20 min after food removal for experiments in which food was removed after 2, 10, or 30 min.
(I and L) Time constant for the response upon food presentation and food removal after 2 and 10 min.

All error bars represent ± SEM. 

See also Figure S3. 
Figure 7. Natural Dynamics of AgRP Projections to the PVH

(A) Schematic showing infection of cell bodies in the ARC and installation of 
optical fiber in the PVH. Scale bar represents 0.5 mm. 

(B) Recording from PVH of a fasted mouse presented sequentially with an 
inedible object, peanut butter, and chow. 

(C and E) Calcium signals from PVH of mice presented sequentially with an 
inedible object, peanut butter, and chow. 

(D and F) Quantification of calcium signals 5 min after event (n = 4 mice). 

(G) Model for regulation of AgRP and POMC neurons by homeostatic and 
sensory information. 

All error bars represent ± SEM. 

See also Figure S4. 



ghrelin versus -9% ± 3% for PBS at 5 min, p = 0.02) (Figure S4),
indicating that they are appropriately regulated by hormonal
signals.

We next tested the effect of food presentation. Mice were
fasted overnight and then presented with either an inedible object,
chow, or peanut butter. Presentation of either chow or peanut
butter rapidly and potently inhibited calcium dynamics in
AgRP (ARC → PVH) projections (ΔF/F = -30% ± 2% for peanut
butter versus -21% ± 3% for chow at 5 min) whereas presentation
of an inedible object had no effect (Figures 7C and 7E). Of
note, peanut butter almost completely eliminated detectable
synchronous activity in PVH axons (Figures 7B and 7C), suggesting
that palatable food presentation is particularly potent in
suppressing the activity of this pathway. Assays utilizing sequential
food presentation revealed a pattern of responses in PVH
projections that closely resembled responses observed in
AgRP cell bodies (Figures 7D and 7F). Likewise, chow presentation
partially reversed the activation of these PVH projections
by ghrelin (Figure S4). Thus, the activity of AgRP (ARC → PVH)
projections is regulated by ghrelin and food presentation in a
way that mirrors the population response in the ARC.

DISCUSSION 

It has been known for more than 75 years that the hypothalamus 
plays a critical role in the control of food intake (Hetherington and 
Ranson, 1939), yet the dynamics of the hypothalamic circuits 
that give rise to feeding behavior have remained a mystery. 
Here, we have used an optical approach to record the natural 
dynamics of the two most widely studied cell types that control 
feeding, AgRP and POMC neurons, in awake behaving mice. 
These experiments have revealed unexpectedly that these neu- 
rons are potently regulated by the sensory detection of food. This 
rapid regulation resets the activation state of AgRP and POMC 
neurons induced by orexigenic signals such as ghrelin or fasting. 
The magnitude and robustness of this response suggest that it
is a primary mechanism that controls the activity of these neu- 
rons in vivo. The speed of this response suggests that it is likely 
mediated by neural input. The dependence on food palatability 
suggests that this response contains information about the 
food’s hedonic properties or energy content, possibly through 
a learned association with smells or other sensory cues. Collec- 
tively, these findings reveal that AgRP and POMC neurons 
receive real-time information about the availability of food in 
the external world, which they then integrate with homeostatic 
signals arising from the body (Figure 7G). This demonstrates a 
more complex and dynamic role for these circuits in the control 
of feeding behavior than is currently appreciated. 

Sensory Feedback Enables Rapid Inhibition of 
Appetitive Processes 

The rapid sensory regulation of AgRP and POMC neurons is 
counterintuitive, since it appears to “short circuit” their well-es- 
tablished function as interoceptive sensors of nutritional state. In 
this model, energy deficit activates AgRP neurons and inhibits 
POMC neurons, thereby generating a motivational drive that pro- 
motes food intake and is only relieved when energy stores are re- 
plenished. An assumption of this model is that internal signals 
generated during feeding (e.g., accumulation of circulating nutrients
or hormones) are responsible for resetting the activation
state of these neurons and thereby reducing the drive to eat. 

Our data, by contrast, show that food detection alone rapidly 
resets the activity of these two cell types and that this resetting 
precedes the onset of actual food consumption. This is surpris- 
ing in light of the fact that stimulation of AgRP neurons is suffi- 
cient to promote food intake (Aponte et al., 2011; Krashes 
et al., 2011). However, our data also show that if food is removed 
before it can be consumed, then these neurons revert to their 
activity level prior to food presentation (Figures 5G and 5J). We 
have likewise found that inaccessible food induces smaller and 
less durable changes in AgRP and POMC neuron activity (Fig- 
ures 5C and 5F). Together, these findings suggest that food 
detection modulates AgRP and POMC neurons in a way that an- 
ticipates the change in their activity that will occur following food 
consumption, taking into account factors such as the food’s 
energy density, perceived accessibility, and the nutritional state 
of the mouse (Figure 7G). 

What is the purpose of this anticipatory regulation? We pro- 
pose that it represents a mechanism to rapidly inhibit foraging 
and other appetitive behaviors once food has been discovered 
(Figure 7G). In this regard, activation of AgRP neurons induces 
not only food consumption but also motivational processes 
that drive food obtainment, including dramatic foraging behavior 
and a willingness to work for food (Atasoy et al., 2012; Krashes 
et al., 2011). These appetitive processes are blocked by food 
discovery as part of the natural transition from foraging to 
feeding, but the mechanisms by which this transition is regulated 
have not been described. Our data show that food discovery re- 
sults in rapid feedback inhibition of AgRP neurons themselves, 
rather than some downstream circuit element, which provides 
a direct mechanism to inhibit foraging once food has been ob- 
tained. The fact that this feedback occurs at the level of AgRP 
neurons is surprising and suggests that the activity of these neu- 
rons is particularly important for generating the motivation to 
search for food relative to other aspects of feeding behavior. 

Models for AgRP-Driven Food Consumption 

The natural dynamics of AgRP neuron activity are consistent 
with a primary function for these neurons in regulating appetitive 
behaviors that promote food discovery. Yet multiple lines of 
evidence have suggested a role for these neurons in controlling 
food consumption as well. We discuss below two possible 
mechanisms by which AgRP neurons could drive food intake 
that are consistent with our data. 

Subpopulations of AgRP Neurons with Specialized 
Functions 

A limitation of fiber photometry is that it measures the average
activity of a population of neurons, which can mask heterogeneity
in the responses of individual cells. AgRP neurons that project
to different downstream targets differ in their ability to induce
food intake and in their expression of the leptin receptor (Atasoy
et al., 2012; Betley et al., 2013; Wu et al., 2012). It is therefore unlikely
that all AgRP neurons show identical responses to stimuli
such as hormone challenge or food presentation. One possibility 
is that a subset of AgRP neurons are activated, rather than in- 
hibited, by food presentation, and this subpopulation of AgRP 



neurons is responsible for driving food consumption. Testing 
this possibility will require measuring the single-cell dynamics 
of AgRP neurons during behavior, using approaches such as 
optogenetic phototagging combined with in vivo recording 
(Lima et al., 2009) or fluorescence microendoscopic imaging 
(Ziv et al., 2013). 

While future experiments are likely to uncover additional 
heterogeneity in these cells, three observations argue against 
this heterogeneity being the primary explanation for how AgRP 
activity drives food consumption. First, the magnitude of the 
decrease in AgRP calcium dynamics that we observe following 
food presentation, particularly for palatable foods (Figure 4F), 
is inconsistent with a major subset of these neurons having the 
opposite regulation. Therefore, if some AgRP neurons are acti- 
vated during feeding, they must represent a minority of the pop- 
ulation. Second, our analysis of AgRP dynamics during individual 
feeding bouts reveals that AgRP activity declines immediately 
preceding meal initiation and then is relatively flat during the 
course of food intake (Figure 6D). These intrameal dynamics 
are not what would be predicted for a neuron whose activity 
directly drives food consumption. Third, and most importantly, 
we have shown that food presentation potently inhibits AgRP 
projections to the PVH (Figure 7). Optogenetic experiments 
have strongly implicated these ARC → PVH projections in the
control of food intake (Atasoy et al., 2012; Betley et al., 2013). 
The fact that these PVH projections show the same activity 
pattern as the population as a whole argues that projection-spe- 
cific dynamics are unlikely to be the primary explanation for how 
these neurons can drive feeding. 

Learning Mediated by AgRP Activity 
An alternative possibility is that AgRP neurons drive food con- 
sumption indirectly via a learning process. In this regard, we 
have shown that the inhibition of AgRP activity following food 
discovery is contingent on subsequent food intake, since this 
inhibition is reversed if the food is removed before it can be 
consumed (Figure 5G). If AgRP activity has negative motivational 
valence (analogous to the unpleasant sensation of hunger), then 
this might enable animals to learn the consequences of failing to 
eat after obtaining food. In this model, food discovery would 
temporarily relieve the sensation of hunger, but animals would 
learn through experience that this sensation returns if the food 
is not consumed. Over time, this would result in appetitive and 
consummatory aspects of feeding becoming linked in sequence 
so that food discovery is always followed by food intake, even 
though AgRP activity itself would largely be extinguished before 
the onset of feeding. Alternative models based on negative 
reinforcement and learning are also conceivable, and untan- 
gling these possibilities will be an important area for future 
investigation. 

Neural Input into AgRP and POMC Neurons 
Communicates the Discovery of Food 

AgRP and POMC neurons receive abundant neural input, and 
indeed, the activation of AgRP neurons by fasting is mediated 
primarily by increased excitatory tone (Liu et al., 2012; Yang 
et al., 2011). Yet most studies of these cells have focused on 
the role of hormones and nutrients, and the role of this afferent 
neural input has remained unclear. Our data indicate that one 
function of this neural input is to communicate to AgRP and
POMC neurons the discovery of food. This is appealing because 
it demonstrates a function for this synaptic input that extends 
beyond merely serving as a redundant source of homeostatic in- 
formation. The fact that the strength of this neural input varies 
depending on the hedonic properties of the detected food sug- 
gests that, at some level, the upstream circuit encodes an asso- 
ciation between sensory information and the food’s nutritional 
content (i.e., a “food memory”). Identification of the neural sub- 
strate of this association may provide an entry point into the 
study of the maladaptive associations between sensory cues 
and food that develop in some eating disorders. As several cell 
types that provide input into AgRP neurons have recently been 
identified (Krashes et al., 2014), it should be possible to elucidate
this afferent pathway using modern circuit mapping techniques. 

Information Processing by Arcuate Feeding Circuits 

Feeding is influenced by diverse types of signals including sen- 
sory, hedonic, homeostatic, and visceral cues. A long-standing 
question has been whether there exists a site in the brain where 
the neural circuits that sense these signals converge, thereby 
enabling integration of this information into a single decision to 
eat or not to eat (Friedman, 2014). The arcuate nucleus in this 
model is traditionally viewed as a sensor for homeostatic cues, 
which it then relays to higher centers where more complex inte- 
gration occurs. This viewpoint is encapsulated in the fact that 
AgRP and POMC neurons are often described as “first order” 
neurons, analogous to primary sensory transduction neurons 
such as rods and cones in the visual system. 

A complication for this model, as mentioned previously, is that
AgRP and POMC neurons are strongly regulated by neural input 
and therefore are not merely sensors of circulating nutritional sig- 
nals. However, absent an understanding of the function of this 
afferent input, it has not been possible to assemble a complete 
picture of the role of these cells. The discovery that this input 
contains information about the sensory and hedonic properties 
of food reveals that these long-studied neurons themselves 
integrate multiple types of food-related information and indeed 
may represent a key convergence point in the feeding circuit. 
The further application of new methods for recording cell-type- 
specific neural activity is likely to provide additional insight into 
how this complex integration is achieved. 

EXPERIMENTAL PROCEDURES 

Experimental protocols were approved by the University of California, San 
Francisco IACUC following the NIH guidelines for the Care and Use of Laboratory
Animals.

Stereotaxic Surgery 

Recombinant AAV expressing GCaMP6s (AAV1.Syn.Flex.GCaMP6s) was purchased
from the Penn Vector Core. AAV was stereotaxically injected into the
ARC of AgRP-IRES-Cre and POMC-Cre mice. In the same surgery, a photometry
cannula was implanted unilaterally in either the ARC or PVH. Mice were
allowed 2-4 weeks for viral expression and recovery from surgery before 
behavioral testing. 

Slice Electrophysiology and Calcium Imaging 

Acute hypothalamic slices were prepared from 8- to 15-week-old AgRP-IRES-Cre
and POMC-Cre mice expressing AAV GCaMP6s for 2-4 weeks. Fluorescent
cells in the ARC were identified for whole-cell patch clamp recordings,
and cells were activated using step currents (10 pA, 2 s) from -20 pA 
to +120 pA or ramp currents (0-40 pA, 10 s) injected under current clamp 
mode. Calcium imaging was performed simultaneously using a digital CCD 
camera mounted on an Olympus BX51 microscope. 

Immunohistochemistry 

Mice were transcardially perfused with PBS followed by formalin. Brains were 
postfixed overnight in formalin and placed in 30% sucrose for 2 days. Free 
floating sections (40 μm) were prepared with a cryostat, blocked (3% BSA,
2% NGS, and 0.1% Triton-X in PBS for 2 hr), and then incubated with primary
antibody (chicken anti-GFP, Abcam, ab13970, 1:1,000) overnight at 4°C. Samples
were washed, incubated with secondary antibody (goat anti-chicken
Alexa488 secondary antibody; Invitrogen, 1:500) for 2 hr at room temperature, 
mounted, and imaged. 

Fiber Photometry 

A rig for performing fiber photometry recordings was constructed following 
basic specifications previously described (Gunaydin et al., 2014). All experi- 
ments were performed in behavioral chambers (Coulbourn Instruments, 
Habitest Modular System) and video recorded using infrared cameras installed 
above each cage. Experiments were performed at the beginning of the dark 
cycle (CT12-CT14) to control for circadian factors and performed in a dark 
environment with illumination of red or infrared light. Mice were acclimated 
to the behavioral chamber for at least 15 min prior to the beginning of each 
testing session. 

For hormone challenge, ghrelin (60 μg/mouse) or vehicle (PBS) was delivered
by intraperitoneal injection in a total volume of 200 μl. For food presentation
experiments, mice were exposed in their home cage prior to testing to
both peanut butter and the rubber stopper in order to remove any effects of 
novelty. Mice were not exposed to chocolate prior to testing. Liquid diet exper- 
iments were performed using a behavioral chamber equipped with an optical 
lickometer (Coulbourn Instruments). Mice were habituated to a liquid diet 
(Ensure vanilla flavor) for 3 days prior to the experiment. Mice were then fasted 
overnight, acclimated to the behavioral chamber for 15 min, and then a bottle 
filled with liquid diet was plugged into the lickometer system and the trial was 
run for 1 hr. Photometry data were subjected to minimal processing consisting 
of only autofluorescence background subtraction and within trial fluorescence 
normalization. 
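
The two processing steps named above can be sketched in a few lines. This is only an illustration: the text does not give the exact formula, so the constant background value, the use of the pre-stimulus mean as the within-trial reference, and the names `normalize_trace`, `autofluorescence`, and `baseline_samples` are all assumptions made for the example.

```python
import numpy as np

def normalize_trace(raw, autofluorescence, baseline_samples):
    """Subtract a constant autofluorescence background, then express the
    trace relative to the mean of the first `baseline_samples` points of
    the same trial (assumed here to be the pre-stimulus baseline)."""
    raw = np.asarray(raw, dtype=float)
    corrected = raw - autofluorescence          # background subtraction
    f0 = corrected[:baseline_samples].mean()    # within-trial reference level
    return (corrected - f0) / f0                # fractional change from baseline

# Hypothetical four-sample trial: two baseline samples, then a drop in signal.
trace = normalize_trace([110.0, 110.0, 90.0, 60.0],
                        autofluorescence=10.0, baseline_samples=2)
```

With these made-up numbers the baseline samples map to 0 and the later samples to negative fractional changes, which is the form in which an inhibition of activity would appear.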

Statistics 

Values are reported as mean ± SEM in the figures and text. P values for pairwise
comparisons were calculated using a two-tailed Student’s t test. P values
for comparisons across multiple groups were corrected using the Holm-Sidak
method in Prism. *p < 0.05, **p < 0.01, ***p < 0.001, ****p < 0.0001.
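
For readers unfamiliar with the Holm-Sidak correction, a minimal sketch of the standard step-down procedure follows. It is a generic textbook version written for illustration, not a reproduction of the Prism implementation, and the function name `holm_sidak` and the example p values are hypothetical.

```python
def holm_sidak(p_values):
    """Return Holm-Sidak step-down adjusted p values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # ascending p values
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # Sidak adjustment over the (m - rank) hypotheses still under test
        adj = 1.0 - (1.0 - p_values[idx]) ** (m - rank)
        running_max = max(running_max, adj)   # keep adjusted values monotone
        adjusted[idx] = min(running_max, 1.0)
    return adjusted

# Three hypothetical pairwise comparisons
adjusted = holm_sidak([0.01, 0.04, 0.03])
```

The smallest p value receives the strongest adjustment, and each later value can never be adjusted below the one before it.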

See also Extended Experimental Procedures. 

SUPPLEMENTAL INFORMATION 

Supplemental Information includes Extended Experimental Procedures, four
figures, and six movies and can be found with this article online at
http://dx.doi.org/10.1016/j.cell.2015.01.033.
SUMMARY

Low energy states delay aging in multiple species, yet 
mechanisms coordinating energetics and longevity 
across tissues remain poorly defined. The conserved 
energy sensor AMP-activated protein kinase (AMPK) 
and its corresponding phosphatase calcineurin 
modulate longevity via the CREB-regulated transcriptional
coactivator (CRTC)-1 in C. elegans. We show
that CRTC-1 specifically uncouples AMPK/calci- 
neurin-mediated effects on lifespan from pleiotropic 
side effects by reprogramming mitochondrial and 
metabolic function. This pro-longevity metabolic 
state is regulated cell nonautonomously by CRTC-1 
in the nervous system. Neuronal CRTC-1/CREB
regulates peripheral metabolism antagonistically
with the functional PPARα ortholog, NHR-49, drives
mitochondrial fragmentation in distal tissues, and 
suppresses the effects of AMPK on systemic mito- 
chondrial metabolism and longevity via a cell-non- 
autonomous catecholamine signal. These results 
demonstrate that while both local and distal mecha- 
nisms combine to modulate aging, distal regula- 
tion overrides local contribution. Targeting central 
perception of energetic state is therefore a potential 
strategy to promote healthy aging. 

INTRODUCTION 

An organism’s energy status is tightly coupled to its rate of aging,
as low energy conditions increase longevity and disease resistance
across the evolutionary spectrum (Burkewitz et al.,
2014). Mechanisms that communicate energetic state between
tissues to coordinate organismal health and longevity remain
poorly understood, however, and must be defined in order to 
translate these effects to human therapeutics. AMP-activated 
protein kinase (AMPK) is a conserved energy sensor activated 
by increases in the AMP/ADP:ATP ratio, which signals low energy
charge (Hardie et al., 2012). AMPK upregulates catabolic
processes and shuts down energy-consuming processes to 
restore cellular energy homeostasis (Hardie et al., 2012). AMPK 
is also pro-longevity; activating AMPK in C. elegans and 
Drosophila increases healthy lifespan and mimics a low energy 
state in well-fed animals (Apfeld et al., 2004; Stenesen et al., 
2013). C. elegans lacking AMPK activity fail to respond to low energy
conditions, such as dietary restriction, that extend wild-type
lifespan (Burkewitz et al., 2014). 

Both AMPK and its effects on aging are conserved across 
eukaryotes (Hardie et al., 2012). Metformin, an indirect AMPK 
agonist, promotes healthy aging in C. elegans (Onken and Dris- 
coll, 2010) and mice (Martin-Montalvo et al., 2013). Deregulation 
of AMPK results in age-onset human pathologies including can- 
cer and neurodegenerative diseases (Hardie et al., 2012). AMPK 
signaling therefore plays a critical role linking energetics to pa- 
thology, making it an attractive target to treat or prevent multiple 
age-related diseases. 

AMPK has both cell-autonomous effects on energetics 
through direct phosphorylation of metabolic effectors (Hardie 
et al., 2012) and cell-nonautonomous effects via integration of 
hormonal and neuroendocrine signals (Dagon et al., 2012; Lerner 
et al., 2009; Minokoshi et al., 2004). The extent to which AMPK 
promotes longevity locally via regulation of metabolism or 
distally via a secondary signal remains unclear. Additionally, 
AMPK’s pro-longevity effects may not be universal in all tissues, 
as AMPK activation in certain cell types appears to increase risk 
for some diseases, and pleiotropic effects of AMPK activation 
unrelated to aging have detrimental physiological consequences 
(Burkewitz et al., 2014). Identifying downstream targets and pro- 
cesses regulated by AMPK that specifically mediate its role in 
longevity would therefore enhance our capacity to harness the 
link between energetics and aging for treatment of age-related 
pathologies. 

Previously, we identified the cyclic AMP-responsive element 
binding protein (CREB)-regulated transcriptional coactivator 
(CRTC)-1 as a critical longevity target of AMPK and calcineurin
in C. elegans (Mair et al., 2011). AMPK and calcineurin antago- 
nistically regulate CRTC-1 phosphorylation status, thereby 
modulating its activity and effect on aging. CRTCs are transcrip- 
tional coactivators first discovered in mammals for their ability to 
bind CREB and regulate its transcriptional activity (Altarejos and
Montminy, 2011). Mammals possess 3 CRTC family members 
that act in distinct tissues, including neurons (CRTC1/2), liver 
(CRTC2), and adipose tissue (CRTC3), and aberrant regulation 
of CRTCs is implicated in multiple chronic diseases, including 
obesity, metabolic disease, and neurodegeneration (Altarejos 
and Montminy, 2011). C. elegans possess a single, highly 
conserved CRTC family member, CRTC-1. AMPK phosphory- 
lates CRTC-1 directly, promoting 14-3-3 binding, cytosolic 
sequestration and inactivation. Blocking phosphorylation of 
CRTC-1 at conserved AMPK target sites, serines 76 and 179, 
renders it refractory to AMPK regulation and constitutively 
nuclear. CRTC-1(S76A,S179A) blocks lifespan extension by AMPK
activation or inhibition of the corresponding phosphatase calcineurin
(Mair et al., 2011).

In this study we demonstrate that CRTC-1 specifically medi- 
ates the longevity output of AMPK. We perform transcriptomic 
analysis to elucidate genes downstream of AMPK/CRTC-1 
signaling, which couple specifically to lifespan regulation. 
Through this approach we identify coordination of mitochondrial 
metabolism by CRTC-1 and the nuclear hormone receptor NHR- 
49, a functional PPARa ortholog (Van Gilst et al., 2005). Notably, 
we demonstrate that these opposing transcriptional effectors 
act in the nervous system to regulate both longevity and sys- 
temic changes in metabolic transcription. NHR-49 is required 
for AMPK/calcineurin-mediated longevity, and limiting NHR-49 
function to neurons is sufficient to mediate both longevity and 
regulation of metabolic genes in peripheral tissues. In addition, 
we demonstrate that neuronal CRTC-1 modulates AMPK/calci- 
neurin-mediated longevity cell nonautonomously via regulation 
of the neurotransmitter/hormone octopamine. Neuron-specific 
activation of CRTC-1, like nhr-49 loss, suppresses AMPK/calcineurin-mediated
longevity and upregulates expression of key
enzymes involved in octopamine synthesis. Correspondingly,
neuronal CRTC-1(S76A,S179A) has no effect on longevity in mutants
deficient in octopamine synthesis. Together these data challenge
the current paradigm that AMPK, CRTC-1 and NHR-49
act cell autonomously to regulate metabolism and longevity, 
and instead highlight their distinct role in communicating percep- 
tion of energy status in neurons to systemic regulation of meta- 
bolism and lifespan. 

RESULTS 

CRTC-1 Is Specific to AMPK Longevity 

We generated a C. elegans transgenic strain expressing a 
truncated AMPK α2 catalytic subunit (AAK-2), which results in
increased T172 phosphorylation and constitutively active (CA)
AMPK (Mair et al., 2011). CA-AAK-2 increases C. elegans
lifespan (Figure 1A), yet induces detrimental pleiotropic side
effects including small body size and reduced reproductive
capacity (Figures 1B and 1C). As shown previously,
CRTC-1(S76A,S179A) abolishes lifespan extension from AAK-2 activation
(Figure 1D) without altering AMPK activity (Figure S1). In
contrast to its role in longevity, CRTC-1(S76A,S179A) does not
suppress CA-AAK-2-mediated effects on growth (Figure 1E)
or reproductive period (Figure 1F, Figure S1). The physiological
effects of CRTC-1(S76A,S179A) in CA-AAK-2 animals are therefore
specific to longevity and do not extend to non-aging-related
traits.

Increased longevity is often coupled to increased stress resis- 
tance. To determine if this was the case for AMPK, we examined 
the effect of CA-AAK-2 on resistance to heat stress at 33°C. 
C. elegans lacking aak-2 are sensitive to heat stress compared 
to wild-type (Apfeld et al., 2004). Conversely, activation of 
AMPK promotes heat resistance, as C. elegans expressing 
CA-AAK-2 show a 125% increase in median survival at 33°C
compared to wild-type animals (Figure 1G; 27 and 12 hr, respectively).
However, unlike its effect on AMPK longevity (Figure 1D),
CRTC-1(S76A,S179A) does not suppress heat resistance conferred
by CA-AAK-2 (Figure 1H). Mechanisms that protect CA-AAK-2
animals from heat stress are thus separable from those which 
promote lifespan extension, and CRTC-1 represents a molecular 
switch that uncouples the longevity effects of AMPK from pleiotropy
unrelated to aging (Figure 1I).

AMPK and CRTC-1 Coordinate Mitochondrial 
Metabolism to Regulate Longevity 

We reasoned that CRTC-1 could be leveraged to identify the 
mechanisms by which AMPK increases lifespan. As
CRTC-1(S76A,S179A) suppresses only the longevity effects of AMPK activation,
genes differentially expressed in CA-AAK-2 animals in a
CRTC-1-dependent manner would be enriched for functions
specific to AMPK-mediated longevity (Figure 1I). We defined
CRTC-1-dependent genes as those differentially regulated in
CA-AAK-2 (Figure 2A, yellow), or in CRTC-1(S76A,S179A); CA-AAK-2
double-transgenics (Figure 2A, red), but not in both (Figure 2A, orange).
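
The filtering logic in this definition is a set operation: a gene counts as CRTC-1-dependent if it is differentially expressed in exactly one of the two backgrounds, and as CRTC-1-independent if it appears in both. A minimal sketch follows; the gene names are placeholders and the function names are hypothetical, chosen only for this illustration.

```python
def crtc1_dependent(de_ca_aak2, de_double):
    """Genes differentially expressed in exactly one of the two backgrounds."""
    return set(de_ca_aak2) ^ set(de_double)   # symmetric difference

def crtc1_independent(de_ca_aak2, de_double):
    """Genes differentially expressed in both backgrounds."""
    return set(de_ca_aak2) & set(de_double)   # intersection

# Placeholder gene lists, not taken from the dataset
de_ca_aak2 = {"gene_a", "gene_b", "gene_c"}   # DE in CA-AAK-2 alone
de_double = {"gene_b", "gene_c", "gene_d"}    # DE in the double transgenic

dependent = crtc1_dependent(de_ca_aak2, de_double)
independent = crtc1_independent(de_ca_aak2, de_double)
```

Here `gene_a` and `gene_d` fall in the dependent set and `gene_b` and `gene_c` in the independent set, mirroring the yellow/red versus orange regions of the schematic.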

We performed RNA-Seq analyses and identified 1,680
genes differentially expressed in worms with activated AAK-2,
activated CRTC-1 or double-transgenics, relative to
wild-type animals (Figures 2B, S2A, Table S2). AMPK induces
small body size, reduced reproduction, and stress resistance
independent of CRTC-1 (Figure 1), thus we predicted that
AMPK-dependent/CRTC-1-independent genes (Figure 2B,
orange region) would be associated with these phenotypes. 
Supporting this hypothesis, the gene ontology (GO) terms most 
enriched among those genes include processes involving germ- 
line differentiation, growth/development, and reproduction (Fig- 
ure 2C, Table S3). Importantly, this tight association between 
phenotypes and functional enrichments within the transcrip- 
tomic changes validated our hypothesis that CRTC-1 could be 
used to filter out pleiotropic effects of AMPK activation unrelated 
to aging. 

To identify processes specifically coupled to longevity, we 
focused on transcriptional changes induced by AMPK that are 
dependent on CRTC-1 activation status (Figure 2A). Of the 869
genes differentially expressed by AAK-2 activation in a CRTC-1-dependent
manner, over 75% are differentially expressed when
both AMPK and CRTC-1 are active (Figure 2B, red region). These
genes are highly enriched for processes associated with metabolism,
and more specifically, processes localized to mitochondria
(Figure 2D, Table S3). We examined the directionality of
CRTC-1-dependent gene expression changes, and found that
suppression of AMPK longevity is associated with a broad down- 
regulation of mitochondrial metabolic processes (Figure 2D). 
Figure 1. CRTC-1 Uncouples AMPK-Mediated Longevity from Other Pleiotropic Phenotypes

(A) AMPK activation (CA-AAK-2) extends lifespan in a wild-type C. elegans background. See Table S1 for lifespan statistics.

(B) CA-AAK-2 suppresses growth. Image of 5 wild-type (left) and 5 CA-AAK-2 (right) adult worms.

(C) Eggs laid per worm over successive 12 hr periods. Data are mean ± SD for n = 25-30 animals; * denotes p < 10“'^ via t test.

(D) CA-AAK-2 fails to promote longevity in a CRTC-1(S76A,S179A) mutant background.

(E) CRTC-1(S76A,S179A) does not suppress AMPK-mediated effects on growth.

(F) Eggs laid per worm over successive 12 hr periods. Data are mean ± SD for n = 19-30 animals; * denotes p < 10“"* via t test.

(G) and (H) Survival curves of AMPK and CRTC-1 transgenic animals exposed to heat stress at 33°C. n = 60-100 worms, p < 10"® via Log-rank (Mantel-Cox) analysis.

(I) CRTC-1 is a longevity-specific AMPK target that uncouples growth, reproduction, and stress resistance from lifespan extension.



To determine whether the transcriptional changes in metabolic 
genes ultimately alter metabolic function we performed metabo- 
lomic analyses. We measured organic acids, amino acids, and 
acylcarnitines, which represent metabolites of the major energy 
producing pathways, in CA-AAK-2 animals with and without 
CRTC-1(S76A,S179A). While there are few significant differences between
CA-AAK-2 and CRTC-1(S76A,S179A); CA-AAK-2 in amino
acids or acylcarnitines (Figures S2C and S2D, Table S4), the
data show widespread differential regulation of TCA cycle intermediates
(Figure 2E). Congruent with the changes observed at
the transcriptional level, the TCA cycle intermediates exhibit a
pattern consistent with altered mitochondrial metabolism being
causal to AMPK/CRTC-1 regulation of longevity. Namely, TCA
intermediates and associated organic acid levels are maintained
or increased by AMPK activation, whereas CRTC-1(S76A,S179A) opposes
these effects for several organic acids, including malate, citrate,
and lactate (Figure 2E). These metabolomic data support a role
for AAK-2 and CRTC-1 in coordinating central metabolic processes
and highlight a new role for CRTCs in mediating transcriptional
links between AMPK status and mitochondrial metabolism.
Although AMPK is a known sensor and regulator of
mitochondrial function and biogenesis (Hardie et al., 2012), these
data now specifically couple these processes to the role of
AMPK in longevity assurance.

Transcriptional Regulation of Metabolism Is Required 
for AMPK and Calcineurin Longevity 

To determine if the metabolic effects of CRTC-1 are causal to 
AMPK longevity, we searched for known alternative interven- 
tions that broadly regulate cellular metabolic processes to 
examine their effects on lifespan. Like AMPK, the nuclear 
hormone receptor and functional PPARα ortholog, NHR-49, activates
during low energy status such as fasting, and transcriptionally
promotes genes required for mitochondrial function
Figure 2. Leveraging CRTC-1 to Identify Longevity-Specific Processes Downstream of AMPK Reveals a Critical Role for Mitochondrial
Metabolism

(A) Schematic Venn diagram illustrating how CRTC-1-specific gene expression filters AMPK-induced changes associated with longevity from those involved in
other pleiotropic phenotypes.

(B) Venn diagram representing the number of differentially expressed (DE) genes identified through RNA-Seq analysis from each indicated genetic background
relative to wild-type controls. See Table S2 for complete list of DE genes.

(C and D) Clusters of enriched GO biological processes and cell compartments among the DE genes involved in CRTC-1-independent phenotypes (C, orange) or
unique to the CRTC-1(S76A,S179A); CA-AAK-2 worms (D, red). Bars represent the percentage of genes within that category that are up- (orange) or downregulated
(blue). The number of genes annotated within a cluster is tabulated, along with the smallest multiple-testing corrected p value for the observed enrichment
attributed to a term within the cluster. See also Figure S2 and Table S3.

(E) Metabolomic analyses of transgenic strains to measure levels of organic acids (E), acylcarnitines, and amino acids (Figure S2). Two-way ANOVAs were
performed with a Sidak multiple comparisons test after metabolite levels were normalized to total protein. Data are mean ± SEM of n = 2-5 replicates per
metabolite in each group, ^p < 0.05 versus CA-AAK-2, ^p < 0.05 versus CRTC-1(S76A,S179A), ^p < 0.05 versus CRTC-1(S76A,S179A); CA-AAK-2. See also Table S4.
Figure 3. NHR-49 Is Required for AMPK- and
Calcineurin-Mediated Longevity

(A) Heat map of genes differentially expressed in
CRTC-1(S76A,S179A); CA-AAK-2 and nhr-49(nr2041)
worms, demonstrating significant overlap in gene
expression patterns. See also Table S5.

(B) Mean mRNA expression levels (average log2 of
fold change relative to wild-type worms) of 29
metabolic genes. In a CA-AAK-2 background,
native (crtc-1 promoter) CRTC-1(S76A,S179A) correlates
with whole-organism loss of NHR-49 (Pearson
correlation coefficient, r), validating the comparison
made in (A). Note r = 0.66; p < 0.0001 after
10% winsorization of strong outliers. See Table S6.

(C) and (D) CA-AAK-2 extends lifespan in a wild-
type background (C), but not in the absence of
NHR-49 function (D). See Table S1 for lifespan
statistics.

(E) and (F) tax-6 RNAi extends lifespan in a wild-
type background (E), but not in the absence of
NHR-49 function (F). The genetic background in B-F
is noted next to the origin.





(Pathare et al., 2012). We compared genes differentially expressed
in CRTC-1(S76A,S179A); CA-AAK-2 transgenic animals
with genes previously reported to be differentially expressed in
nematodes lacking functional nhr-49 (Pathare et al., 2012). Of
the 47 NHR-49-dependent genes identified by Pathare et al.,
we observe 30% overlap (χ² = 13.45, p = 10“®), suggesting
that NHR-49 and CRTC-1 coordinately regulate shared meta- 
bolic targets (Figure 3A; Table S5). We selected a candidate 
list of 29 metabolic genes, including genes regulated by 
CRTC-1 from our RNA-Seq dataset and by NHR-49 from previ- 
ously published data (Table S6), and validated the high degree of 
correlation between CRTC-1 activation and NHR-49 loss of 
function in a CA-AAK-2 background relative to wild-type (Figure 3B;
p < 0.01). When AMPK is activated, loss of nhr-49 mirrors
activation of CRTC-1 to promote a transcriptional reprogram- 
ming of metabolic genes. 

If transcriptional regulation of metabolism by CRTC-1 is causal 
to AMPK lifespan extension, we hypothesized that AMPK should 
also fail to promote longevity in nhr-49 mutants, since they reca- 
pitulate similar transcriptional changes in metabolic processes. 
In support of this hypothesis, an nhr-49(nr2041) deletion allele
suppresses lifespan extension via CA-AAK-2 (Figures 3C and 3D). Our previous
work established that AMPK and calcineurin mediate longevity through a shared
signaling pathway that converges on CRTC-1. Calcineurin, a protein phosphatase,
directly opposes AMPK by dephosphorylating and activating CRTC-1.
RNAi-mediated knockdown of tax-6, the catalytic subunit of calcineurin,
mimics AMPK activation by increasing C. elegans lifespan in a CRTC-1-dependent
manner (Mair et al., 2011) and additionally activates expression of the
NHR-49-dependent target, acs-2 (Figure S3). Strikingly, tax-6 RNAi also requires intact NHR-49
function to promote longevity (Figures 3E and 3F). These results 
suggest that AMPK/calcineurin signaling promotes a shift in 
metabolic programs by orchestrating the activity of opposing 
transcriptional effectors, CRTC-1 and NHR-49, and that this 
metabolic switch is required for longevity. 

CRTC-1 Acts through CREB in Neurons to Mediate 
Longevity 

Both aak-2 and nhr-49 are expressed ubiquitously (Mair et al., 
2011; Van Gilst et al., 2005) and are believed to function as 
cell-autonomous regulators of metabolism. In contrast, crtc-1 
expression is limited to the intestine and neurons in C. elegans 
(Mair et al., 2011). We reasoned that CRTC-1 may directly regu- 
late transcription of genes involved in metabolism in the intestine, 
one of the major organs of cellular metabolic activity and fat stor- 
age in C. elegans, and that this effect may be sufficient to system- 
ically modulate longevity. To test this hypothesis, we expressed 
CRTC-1(S76A,S179A) from the ges-1 promoter, limiting its expression
to intestinal cells. Surprisingly, intestinal expression of
CRTC-1(S76A,S179A) has no effect on tax-6 RNAi-mediated
longevity (Figures 4A and 4B). Having ruled out the intestine as the site
of action, we examined the role of neuronal CRTC-1 in AMPK/calcineurin-mediated
longevity. Expressing CRTC-1(S76A,S179A) from
the pan-neuronal rab-3 promoter (Figure 4C, inset) fully suppresses
lifespan extension by both tax-6 RNAi (Figure 4C) and
AAK-2 activation (Figures 4D and 4E). Upon finding that selective 
CRTC-1 activation in neurons is sufficient for its effects on life- 
span, we asked whether activating AMPK in select tissues is 
also sufficient to promote longevity. We expressed CA-AAK-2 
from pan-neuronal, muscle, and intestinal-specific promoters. 
AMPK activation is not sufficient for longevity in any of the individ- 
ual tissues tested (Figure S4A), suggesting this longevity mecha- 
nism in C. elegans requires local AMPK-mediated programming 
of mitochondrial function in multiple tissues. Taken together 
these results indicate that CRTC-1 activity in neurons cell nonau- 
tonomously modulates AMPK/calcineurin-mediated longevity. 
Moreover, signals downstream of neuronal CRTC-1 dominantly 
override the effects AMPK exerts locally in peripheral tissues. 

CRTCs lack DNA-binding activity and depend on partner tran- 
scription factors for recruitment to DNA in order to stimulate gene 
transcription (Altarejos and Montminy, 2011). Though first identified
as CREB modulators, CRTCs also bind and regulate other
bZIP transcription factor family members. To determine whether 
the effects of neuronal CRTC-1 on aging occur via CREB, we 
tested whether CREB was necessary for longevity suppression 
by CRTC-1(S76A,S179A). In animals lacking CRH-1, the C. elegans
CREB ortholog, CRTC-1(S76A,S179A) expressed under its endogenous
promoter no longer suppressed CA-AAK-2-mediated
longevity (Figures 4F and 4G). Additionally, neuronal expression
of CRTC-1(S76A,S179A) in crh-1 null animals had no effect on lifespan
extension by tax-6 RNAi (Figures 4H and 4I). Lifespan
extension via AMPK/calcineurin therefore requires inhibition of
the CRTC-1/CRH-1 transcriptional complex in neurons.

Neuronal CRTC-1 Cell Nonautonomously Regulates 
Metabolic Genes 

Since CRTC-1 regulates longevity through a systemic metabolic 
program (Figure 2) and neuron-limited CRTC-1 activation is suf- 
ficient to suppress longevity, we sought to determine if neuronal 
CRTC-1 is sufficient to produce the lifespan-related effects on 
metabolic transcription. Analyzing the same panel of 29 meta- 
bolic genes used previously (Figure 3B), we found that limiting 
CRTC-1(S76A,S179A) expression to neurons recapitulates the effects
on peripheral metabolic genes seen in animals expressing
CRTC-1(S76A,S179A) from its native promoter (Figure 4J; p <
0.0001). Therefore, similarly to lifespan, CRTC-1 regulates metabolism
cell nonautonomously from neurons. Finally, we determined
that neuron-specific activation of CRTC-1 mimics genetic
deletion of nhr-49 regarding peripheral expression of genes 
involved in cellular metabolism (Figure 4K; p < 0.01). Despite 
the presence of CA-AAK-2 in every background, AAK-2-activated
animals alone exhibit no correlation in gene expression
with any of the double transgenics, indicating these transcriptional
effects are all attributable to NHR-49 and CRTC-1 (Figures
S4B-S4D; p > 0.05). Taken together, the striking overlap
between gene expression profiles (Figure 4L; Table S6) reveals
an antagonistic relationship between metabolic programs regulated
by NHR-49 and neuronal CRTC-1.



NHR-49 Regulates Metabolism and Lifespan Cell 
Nonautonomously 

NHR-49 has previously been characterized as a functional ortho- 
log of mammalian PPARa, as nhr-49 mutants fail to activate 
mitochondrial fatty acid oxidation (FAO) genes during starvation 
(Van Gilst et al., 2005). However, whether this effect is cell auton- 
omous is unknown, as the NHR-49 DNA-binding motif has not 
been determined and direct targets of NHR-49 remain elusive. 
To determine whether NHR-49 can mediate lifespan and meta- 
bolism cell nonautonomously, we selectively restored NHR-49 
function to neurons in an nhr-49 null background by driving 
expression with the rab-3 promoter. As previously described, 
C. elegans subjected to 24 hr of fasting show strong upregulation 
of a key gene involved in beta-oxidation, acyl-CoA synthetase 
(ACS)-2, which is dependent on nhr-49 (Figure 5A). Supporting 
a role for NHR-49 beyond cell-autonomous regulation of meta- 
bolic gene expression, neuronal rescue of NHR-49 is sufficient 
to restore both basal acs-2 expression (Figure 5A) and the induc- 
tion of acs-2 by fasting (Figure 5B). Since mRNA measurements 
were obtained using whole-animal preparations, we generated 
animals expressing GFP under control of the acs-2 promoter 
to examine tissue-specific induction. Fasting for 24 hr induces 
expression of GFP in multiple tissues, including pharynx, mus- 
cle, and intestine (Figure 5C). This induction is abrogated by 
loss of nhr-49 (Figure 5D). Remarkably, neuronal expression of 
nhr-49 is sufficient to restore induction of acs-2 in both neurons 
and distal tissues, including muscle and intestine, but not 
pharynx (Figure 5E). 

Induction of acs-2 expression in multiple tissues lacking 
NHR-49 suggests alternative transcription factors might 
respond to signals downstream of neuronal CRTC-1. To identify
transcriptional regulators in this pathway, we revisited our RNA-Seq
dataset and performed Hypergeometric Optimization of
Motif EnRichment (HOMER) analysis of the ‘CRTC-1-dependent
genes’ (Figure 2B, red section). We identified a motif with
consensus TGATAACG or CGTTATCA enriched in the putative
promoter regions of genes downstream of AMPK/CRTC-1
(p = 1e-25, found in 38% of targets versus 21% of background)
(Figure S5A). The motif strongly resembles the GATA-like DAE
(DAF-16 Associated Element), recently identified as the binding
site for PQM-1 (Tepper et al., 2013). modENCODE ChIP-seq
data also suggest that both DAF-16 and PQM-1 bind the acs-2
promoter directly. To determine whether DAF-16 can regulate
expression of acs-2, we examined the effect of fasting on our 
acs-2P::GFP reporter under control conditions, and in animals 
subjected to RNAi for daf-16, along with two other transcription 
factors known to mediate DR longevity, skn-1 and pha-4. While 
inhibition of skn-1 and pha-4 does not alter induction of acs-2 
by fasting, daf-16 RNAi significantly suppresses acs-2 induction 
(Figures S5B and S5C). These data therefore suggest that DAF- 
16 and/or PQM-1 might be acting downstream of the neuronal 
CRTC-1 signal to modulate metabolic gene expression. 
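
The enrichment statistic behind this kind of motif analysis can be illustrated with a hypergeometric tail probability. The sketch below is a minimal illustration and not HOMER's implementation. The gene counts are hypothetical, chosen only to reproduce the reported 38% versus 21% proportions, so the resulting p value is not expected to match the published p = 1e-25.

```python
from math import comb


def hypergeom_tail(k, n_targets, n_with_motif, n_total):
    """Return P(X >= k): the chance that at least k of n_targets sequences
    carry the motif when n_with_motif of n_total sequences carry it overall."""
    denom = comb(n_total, n_targets)
    upper = min(n_targets, n_with_motif)
    numer = sum(
        comb(n_with_motif, i) * comb(n_total - n_with_motif, n_targets - i)
        for i in range(k, upper + 1)
    )
    return numer / denom


# Hypothetical counts (assumptions, not values from the paper).
targets = 200                  # target promoters
targets_with_motif = 76        # 38% of targets
background = 2000              # background promoters
background_with_motif = 420    # 21% of background

p_value = hypergeom_tail(
    targets_with_motif,
    targets,
    targets_with_motif + background_with_motif,
    targets + background,
)
print(f"enrichment p value (hypothetical counts): {p_value:.3g}")
```

Because the p value depends on the absolute number of sequences as well as on the proportions, the same 38% versus 21% split yields a much smaller p value with larger gene sets.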

Given the ability of NHR-49 to cell nonautonomously regulate 
metabolic genes, we next asked whether, like CRTC-1, NHR-49
mediates longevity through its effects in neurons. Restoring 
NHR-49 function in neurons exclusively was sufficient to restore 
lifespan extension via both tax-6 RNAi (Figures 5F-5H) and 
AMPK activation (Figures 5I-5K) in nhr-49 deletion mutants. 
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Figure 4. Neuronal CRTC-1/CRH-1 Activation Suppresses Longevity Downstream of AMPK and Calcineurin Signaling and Cell Non- 
autonomously Regulates Metabolic Transcription 

(A) tax-6 RNAi increases longevity. See Table S1 for lifespan statistics. The genetic background in A-I is noted next to the origin.

(B) Intestine-specific (ges-1 promoter) CRTC-1(S76A,S179A) fails to suppress tax-6 RNAi longevity. Inset: image of intestine-specific tdTOMATO-tagged CRTC-1(S76A,S179A).

(C) Neuronal-specific (rab-3 promoter) CRTC-1(S76A,S179A) fully suppresses tax-6 RNAi-mediated longevity. Inset: image of neuron-specific tdTOMATO-tagged
CRTC-1(S76A,S179A); scale bar, 100 µm.

(D and E) CA-AAK-2 extends lifespan in a wild-type background (D), but neuronal CRTC-1(S76A,S179A) suppresses AMPK-mediated longevity (E).

(F and G) CA-AAK-2 extends lifespan and this effect is blocked by expressing CRTC-1(S76A,S179A) from its native promoter (F). In crh-1 null mutants, CRTC-1
activation fails to suppress AMPK-mediated longevity (G).

(H and I) Neuron-specific activation of CRTC-1(S76A,S179A) suppresses longevity mediated by tax-6 RNAi (H), but neuronal CRTC-1 requires intact crh-1
function to mediate its effects on aging (I).

(J and K) Expression levels of 29 metabolic genes in CA-AAK-2 animals reveal that the effects of neuron-specific CRTC-1(S76A,S179A) correlate strongly with
expression of CRTC-1 from its native promoter (J); and neuron-specific CRTC-1(S76A,S179A) activation correlates with the transcriptional effects of loss of
NHR-49 function (K). r and p values were derived by Pearson correlation. See also Table S6.

(L) Dendrogram summarizing similarities in metabolic gene expression between AMPK, CRTC-1 and NHR-49 mutants. Strains were clustered by their pair-wise
Pearson correlations using Ward’s minimum variance method. Vertical heights of branches indicate the degree of correlation (r; y axis). Multiscale bootstrap
resampling p values on each branch were calculated via pvclust R package [http://www.sigmath.es.osaka-u.ac.jp/shimo-lab/prog/pvclust/].
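
The pairwise Pearson correlations that underlie panels (J)-(L) can be sketched as follows. This is an illustrative example only: the strain labels and log2 fold-change values are hypothetical placeholders, not data from Table S6, and the Ward clustering and pvclust bootstrap steps are not reproduced here.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical log2 fold changes for the same five genes in three strains.
profiles = {
    "strain_a": [1.2, -0.5, 0.8, 2.0, -1.1],
    "strain_b": [1.0, -0.4, 0.9, 1.7, -0.9],
    "strain_c": [-0.9, 0.7, -0.8, -1.6, 1.0],
}

# One correlation per unordered pair of strains.
pairwise = {
    (a, b): pearson_r(profiles[a], profiles[b])
    for a in profiles
    for b in profiles
    if a < b
}

for pair, r in sorted(pairwise.items()):
    print(pair, round(r, 3))
```

A matrix of such coefficients is the kind of input a hierarchical clustering step would consume. Strains with similar expression profiles give r near 1 and opposing profiles give r near -1.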
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However, unlike the effects of CRTC-1 on AMPK/calcineurin- 
mediated longevity, which is specific to neurons, the role of 
NHR-49 is more complex, as intestinal rescue can also partially 
reverse suppression of tax-6 RNAi lifespan in nhr-49 deletion 
mutants and is sufficient to restore acs-2 induction exclusively 
in intestinal cells (Figures S6A and S6B). Additionally, while 
nhr-49 overexpression in intestine does not affect C. elegans 
longevity, overexpression of nhr-49 in neurons is sufficient to 
extend lifespan, further highlighting tissue-specific functions 
(Figures S6C and S6D). 

Neuronal CRTC-1 Cell Nonautonomously Regulates 
Mitochondrial Dynamics 

Recent studies of mitochondrial dynamics suggest that remodel- 
ing of the mitochondrial network itself may impact metabolic 
function (Liesa and Shirihai, 2013), and loss of NHR-49 has 
been shown to disrupt mitochondrial morphology and function 
(Pathare et al., 2012). Given that neuronal CRTC-1 and 
NHR-49 antagonistically regulate AMPK/calcineurin-mediated 
longevity and metabolic processes, we explored whether 
changes in mitochondrial architecture were involved in the 
metabolic reprogramming and longevity by AMPK/CRTC-1. To 
observe the mitochondrial network directly in distinct tissues, 
we employed nematodes expressing mRFP targeted to the outer 
mitochondrial membrane via fusion to TOM20. Typically, the 
mitochondria of muscle cells of young (day 1) adult worms are 
fused and tubular, running parallel among the myofilaments (Fig- 
ure 6A). Activation of CRTC-1 exclusively in neurons of young 
adult worms, however, results in significant fragmentation of 
the mitochondrial network in muscle cells, demonstrating a 
cell-nonautonomous role for CRTC-1 in regulating mitochondrial 
dynamics (Figure 6B). This effect is consistent with the opposing 
transcriptional effects of NHR-49 and neuronal CRTC-1, as loss
of nhr-49 also causes mitochondrial fragmentation and altered
morphology (Figures 6C-6E). To quantify the degree of fragmen-
tation in these animals, we determined the ratio of mitochondrial
area to perimeter in muscle cells of neuronal CRTC-1(S76A,S179A)
mutants and found a 56% reduction relative to control animals
(Figure 6F; p < 0.0001). The area of muscle cells occupied by
mitochondria was also decreased 30% upon neuronal CRTC-1
activation (Figure 6G; p < 0.0001), supporting CRTC-1-mediated
suppression of mitochondrial function observed at the transcrip- 
tomic and metabolomic levels (Figure 2). 

CRTC-1 Mediates Lifespan via a Catecholamine Signal 

Having determined neuronal CRTC-1 mediates longevity and 
mitochondrial function cell nonautonomously, we reasoned it 
might regulate a signal that relays energy status from neurons 
to coordinate aging and metabolism in peripheral tissues. Mono- 
amine signals, e.g., dopamine and serotonin, act in nutrient- 
sensing pathways to regulate behavioral and peripheral meta- 
bolic changes conserved from nematodes to humans (Ashrafi, 
2007). We therefore examined whether monoamine signaling 
might provide a potential mechanism by which neuronal 
CRTC-1 could regulate longevity and mitochondrial dynamics. 
Monoamines are secreted via exocytosis in dense core vesicles 
(DCVs), which in C. elegans requires the Ca2+-dependent acti-
vator protein for secretion (CAPS), UNC-31 (Grishanin et al.,
2004). Null mutation of unc-31 suppresses acs-2 expression,
as determined by our acs-2P::GFP reporter strain and qRT- 
PCR (Figures S7A-S7C), suggesting secreted signals within 
DCVs may mediate distal metabolic gene regulation. 

In our RNA-Seq analysis we identified multiple enzymes 
involved in the synthesis of biogenic amines that were differen- 
tially mediated by AMPK and CRTC-1 (Figure 7A). We further 
examined the genes involved in monoamine synthesis by qRT-
PCR and identified a catecholamine biosynthetic enzyme, tyra-
mine beta-hydroxylase (TBH)-1, among the genes regulated by
CRTC-1 that couples to AMPK longevity (Table S2). Downregu-
lation of tbh-1 expression by AMPK activation is attenuated by
CRTC-1(S76A,S179A) (Figure 7B). Acting in a pathway with tyrosine
decarboxylase (TDC)-1, TBH-1 catalyzes the synthesis of
octopamine, which functions as the invertebrate noradrenaline
equivalent (Figure 7C). Analysis of the tdc-1 and tbh-1 promoters
revealed putative cAMP response elements (not shown), sug-
gesting that CRTC-1/CRH-1 may directly regulate transcription
of these genes. As previously reported, expressing GFP down-
stream of the tbh-1 promoter results in expression exclusively
in two octopaminergic RIC neurons (Alkema et al., 2005). Notably,
CRTC-1 also localizes to RIC neurons (Figure 7D) and thus
may be capable of directly regulating octopamine signaling. 
We therefore hypothesized that octopamine may play a role in 
intercellular AMPK/CRTC-1 longevity signaling. 

If octopamine mediates neuronal CRTC-1 signaling to other 
tissues, we reasoned it might be sufficient to generate the mito- 
chondrial and metabolic phenotypes observed upon neuronal 
CRTC-1 activation. To test this hypothesis, we cultured nema- 
todes expressing the mitoRFP reporter in the presence of exog- 
enous octopamine. Strikingly, octopamine treatment elicits a 
similar degree of mitochondrial fragmentation in C. elegans mus- 
cle to that observed when CRTC-1 is activated neuronally, 
further suggesting that octopamine signaling may mediate 
CRTC-1 regulation of metabolism to peripheral tissues (Figures 
7E-7G). In direct support of octopamine as a relay signal, tax-6 
RNAi increases expression of acs-2, but this effect is blunted 
in animals lacking TDC-1 or TBH-1 (Figure S7D). 

To define the functional requirement of octopamine signaling 
in the modulation of aging by the AMPK/CRTC-1/NHR-49
pathway, we tested whether the suppression of longevity by
neuronal CRTC-1 or nhr-49 deletion required either
TDC-1 or TBH-1. As shown previously, AMPK activation robustly
increases lifespan of wild-type animals and this effect is
suppressed in worms expressing neuronal CRTC-1(S76A,S179A)
(Figure 7H) or nhr-49(nr2041) (Table S1). While suppression of
lifespan by nhr-49 deletion is independent of octopamine
signaling (Figures S7E and S7F), the ability of neuronal
CRTC-1(S76A,S179A) to suppress longevity is completely abolished in ani-
mals harboring null mutations in either tdc-1 (Figure 7I) or tbh-1
(Figure 7J), both of which lack octopamine (Alkema et al.,
2005). Confirming these findings, neuronal CRTC-1(S76A,S179A)
suppresses longevity mediated by tax-6 RNAi (Figure 7K), and 
this suppression requires TDC-1 (Figure 7L) and TBH-1 (Fig- 
ure 7M) function. The ability of CRTC-1 to regulate aging in 
C. elegans therefore requires functional octopamine signaling. 
Together these results suggest that neuronal CRTC-1 modulates 
AMPK/calcineurin-mediated longevity and metabolism via an 
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Figure 5. NHR-49 Regulates Mitochondrial Metabolism and Longevity Cell Nonautonomously from Neurons 

(A and B) Analysis of acs-2 transcript levels by RT-PCR in L4/young adult worms fed (A) and fasted 16 hr (B). Data are mean ± SEM of 3-4 independent
experiments. By 1-sample (A) or 2-sample (B) t test relative to fed wild-type animals, *** denotes p < 0.001; ** = p < 0.01; ns = p > 0.05.

(C-E) Brightfield (left) and fluorescence (middle and right) imaging of L4-stage 16 hr fasted worms expressing GFP driven by the acs-2 promoter. (C) Fasting
activates the acs-2 promoter ubiquitously in C. elegans (middle), and higher magnification reveals strongest expression in the intestine and pharynx (right).

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Figure 6. Neuronal CRTC-1 and NHR-49 Regulation of Mitochondrial Dynamics Mirrors Their Respective Roles in Longevity 

(A-C) Fluorescence imaging (left) and binary representations (right) of mitochondrial networks in body wall muscle cells of day 1 adult worms. Neuronal
CRTC-1 (B) and nhr-49 loss-of-function (C) induce fragmentation of the mitochondrial network in muscle cells relative to wild-type (A). Scale bars
represent 20 µm.

(D and E) Quantification of neuronal CRTC-1(S76A,S179A) (D) or nhr-49 loss-of-function (E) dependent mitochondrial fragmentation in a population of worms
demonstrates loss of tubular morphology (mean ± SD of n = 3 groups of 10-17 worms; p = 0.001 by t test).

(F) Quantification of the ratio between mitochondrial area and perimeter (mean ± SD from muscle cells of 32 worms; p < 0.0001 by t test).

(G) Neuronal CRTC-1 activation decreases the area of muscle cells occupied by mitochondria (mean ± SD from muscle cells of 32 worms; p < 0.0001 by
t test).



octopamine signal. Moreover, this cell-nonautonomous signal is 
dominant over the cell-autonomous effects of AMPK in periph- 
eral tissues (Figure S7G). 

DISCUSSION 

These data challenge current thinking regarding strategies to 
translate the link between energetics and longevity for therapeu- 
tics; perception and cell-nonautonomous communication of 
energy status in neurons can override direct activation of pro- 
longevity factors in distal tissues and might therefore be targeted 
for healthy aging. Although AMPK and PPARs are both targeted 
peripherally to promote metabolic homeostasis in humans (Har- 
die et al., 2012; Wahli and Michalik, 2012), both can affect meta- 
bolism cell nonautonomously from the CNS (Bantubungi et al., 
2012; Kocalis et al., 2012; Minokoshi et al., 2004). To date, the 
relative contributions of these local and distal effects to healthy 
aging have not been explored. Here, we demonstrate a cell- 
nonautonomous role these metabolic regulators play in coordi- 
nating energetics and longevity via effects on neuronal catechol- 
amine signaling. AMPK locally and cell autonomously promotes 
remodeling of mitochondrial metabolic networks to increase 
longevity; however, AMPK must also inactivate CRTC-1-depen-
dent transcription in neurons to systemically and cell nonauton-
omously generate a permissive transcriptional landscape for its
local metabolic programming. Critically, these cell-nonautono- 
mous signals dominantly impact longevity, irrespective of 
AMPK activation and energetic state in receiving cells. Intrigu- 
ingly, both neuronal activation of CRTC-1 and octopamine sup- 
plementation promote mitochondrial network fragmentation, 
suggesting dynamics of the mitochondrial network can be 
shaped from a distance and are critical for the ability of AMPK 
to promote longevity. However, this study also raises key ques- 
tions going forward, including the sufficiency of the neuronal 
signal for longevity assurance, and how these mechanisms 
might translate to mammalian systems and therapeutics de- 
signed to promote metabolic homeostasis. Specifically, perhaps 
treatments targeting peripheral metabolic effectors to promote 
healthy aging will have reduced efficacy if cell-nonautonomous 
CNS signals remain discordant. 

Although AMPK, PPARs, and CRTCs are key peripheral meta- 
bolic regulators, all have emerging roles in neuroendocrine con- 
trol of organismal metabolism that may become dysfunctional 
with age or obesity. Early studies of CRTCs focused on regula- 
tion of glucose metabolism in the mammalian liver, but there 
are three CRTC family members in mammals, two of which 
are expressed in the nervous system. Though less studied, 
recently elucidated roles of neuronal CRTCs include regulating 



(D) nhr-49 mutants fail to activate acs-2. (E) Neuron-limited rescue of NHR-49 restores acs-2 levels in neurons (arrowhead) and peripheral tissues (arrows). Boxes 
outline areas magnified in the right panels. Scale bars represent 50 µm.

(F-K) Survival analysis demonstrating that tax-6 RNAi (F) and CA-AAK-2 (I) extend lifespan in wild-type worms, but not worms lacking nhr-49 (G, J). Restoring
NHR-49 function selectively to neurons via the rab-3 promoter rescues tax-6 RNAi- (H) and CA-AAK-2-mediated longevity (K). Common genetic backgrounds are
indicated next to the origin. See also Figure S5 and Table S1 for lifespan statistics.
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Figure 7. Neuronal CRTC-1 Regulates Longevity and Mitochondrial Function Cell Nonautonomously through Catecholamine Signaling 

(A) Normalized read counts for enzymes involved in synthesis of biogenic amines from the RNA-Seq analysis (see also Table S2). 

(B) qRT-PCR validating AMPK/CRTC-1 regulation of tbh-1 transcript levels (mean ± SEM of mRNA levels extracted from 2-4 samples of 50-100 animals; * denotes
p < 0.05 by t test).

(C) The biosynthetic pathway of octopamine.

(D) Fluorescence imaging of a worm co-expressing CRTC-1::tdTOMATO from the native crtc-1 promoter (top left) and GFP driven by the tbh-1 promoter reveals
CRTC-1 expression in octopaminergic RIC neurons.

(E and F) Fluorescence imaging (top) and binary representations (bottom) of the mitochondrial network in muscle cells of animals vehicle-treated (water) (E) or grown
on media with 5 mM octopamine (F).

(G) Classifying worms by their mitochondrial morphology reveals a 45% decline in the fraction of worms with tubular mitochondria treated with octopamine versus
control (p < 0.001 by t test; mean ± SD of n = 3 samples of 11-19 worms).

(H-M) Survival curves demonstrating that neuron-specific activation of CRTC-1 suppresses both AMPK- (H) and calcineurin-mediated (K) lifespan
extension. However, neuronal CRTC-1 has no effect on AMPK or calcineurin-mediated longevity in animals lacking functional tdc-1 (I, L) or tbh-1 (J, M).
Genetic backgrounds are noted next to the origin. See Table S1 for lifespan statistics.



expression of peptide signals and metabolic homeostasis in the 
periphery. Deletion of CRTC1, which is expressed primarily in the
brain in mammals, results in hyperphagia and obesity in mice
(Altarejos et al., 2008; Breuillaud et al., 2009). More recently,
knockout of TRPV1 pain receptors was shown to both promote
metabolic fitness and extend lifespan through effects in mouse 
sensory neurons (Riera et al., 2014). Interestingly, TRPV1 mu- 
tant mice exhibit nuclear exclusion of CRTC1 and reduced 
expression of a neuropeptide important for regulating glucose
homeostasis. Although the requirement of neuronal CRTC1 inhi- 
bition for the longevity effects associated with TRPV1 loss of 
function in mice remains untested, our results point to a poten- 
tially conserved, neuronally mediated mechanism by which 
CRTCs regulate systemic metabolic homeostasis and impact 
the aging process in worms and mice. 

AMPK cell autonomously regulates numerous physiological 
processes known to play roles in aging, including autophagy, 
protein synthesis, mitochondrial biogenesis, and both lipid and 
glucose metabolism (Burkewitz et al., 2014). Our data indicate 
mitochondrial metabolism is causally associated with AMPK 
longevity. Moreover, they suggest that AMPK regulation of 
both longevity and metabolism can be divided into two compo- 
nents: acute remodeling of metabolic pathways through direct 
regulation of enzymatic activity, and long-term remodeling of 
cell function via transcriptional reprogramming. Surprisingly, 
the transcriptional effects of AMPK are induced via cell-nonau- 
tonomous signals that override local enzymatic effects; CRTC- 
1 transcription in neurons suppresses lifespan despite AMPK 
being constitutively active in all tissues. Cunningham et al. 
(2014) recently identified a cell-nonautonomous role for neuronal
AMPK in modulating peripheral lipid storage in nematodes, 
which supports cell-nonautonomous effects of AMPK/CRTC 
on metabolism. The role of AMPK in the central regulation of pe- 
ripheral metabolism is conserved in mammals; AMPK integrates 
hormonal signals in the hypothalamus to control energy homeo- 
stasis, satiety, and metabolism (Minokoshi et al., 2004). Further, 
in response to changes in glucose levels, AMPK regulates 
CRTC2 activity in the murine hypothalamus to modulate insulin 
signaling via IRS2 (Lerner et al., 2009). Communication between 
central and peripheral AMPK activity and its effect on metabolic 
homeostasis and aging in other organisms will be an exciting 
area for future research. While expressing truncated CA-AAK-2 
in individual tissues of C. elegans in our study failed to promote 
longevity, work in Drosophila has shown that overexpressing 
wild-type AMPK in muscle or fat body (Stenesen et al., 2013) 
or activated AMPK mutants in brain or gut (Ulgherait et al.,
2014) is sufficient to extend lifespan. Differential effects seen in
C. elegans and Drosophila may be due to methods employed
to generate tissue-specific strains. Moving forward, more
work is needed to better elucidate how tissue-specific roles of 
AMPK coordinate to control longevity across different model 
systems. 

Like AMPK and CRTCs, PPAR family transcription factors are 
best known for their primarily cell-autonomous roles in regulating 
metabolism, including lipid uptake, storage, and oxidation. Here, 
we demonstrate that the proposed worm PPARα, NHR-49, acts
antagonistically to CRTC-1/CREB, regulating the shift in meta- 
bolic and mitochondrial programming, and that neuronal nhr- 
49 is sufficient for AMPK-mediated longevity (Figure 5). Notably, 
novel PPAR functions in the mammalian brain have also begun to 
emerge. The thiazolidinedione (TZD) class of anti-diabetic drugs 
is associated with weight gain, and two complementary studies 
identified brain PPARγ as the critical mediator of TZD-induced
effects on food intake, thermogenic energy expenditure, and pe-
ripheral glucose metabolism (Lu et al., 2011; Ryan et al., 2011). In
addition, PPARα null mice show increases in glucose turnover,
body weight, and adipogenesis that are not rescued by restoring
hepatic PPARα function. Pharmacologically activating PPARα in
the brain of these mice, however, decreases glucose usage in
peripheral tissues (Knauf et al., 2006). How PPARs and NHR-
49 function in neurons to systemically regulate metabolic ho-
meostasis with age remains unknown and is an important area
for future work.

An exciting key finding of this study is the novel role of
octopamine, the invertebrate equivalent to the catecholamine 
(nor)adrenaline, as a signal communicating energetic state be- 
tween neuronal AMPK/CRTC-1 and the periphery to modulate 
longevity (Figure 7). Interestingly, there is precedent for both 
AMPK and CRTCs in the regulation of analogous bioamine path-
ways in mammals. In mice, AMPKα2 suppresses sympathetic
catecholamine release (Viollet et al., 2003), while CRTC1 en- 
hances monoamine signaling in the prefrontal cortex (Viollet 
et al., 2003). However, it remains unclear whether octopamine 
is acting as a neurotransmitter or a neuroendocrine signal to 
mediate longevity in C. elegans. Octopamine and 5-HT were 
recently shown to act through a positive regulatory loop in neu- 
rons to promote release of an unidentified endocrine factor 
capable of activating the nuclear hormone NHR-76 to regulate 
lipid oxidation in the C. elegans intestine (Noble et al., 2013). 
That at least 2 nuclear hormone receptors (NHR-49 and NHR- 
76) act downstream of octopamine suggests that perhaps 
octopamine regulates the release of a lipophilic hormone. Alter- 
natively, given our finding that DAF-16/FOXO may be involved in 
metabolic transcription downstream of neuronal CRTC-1, oc- 
topamine may regulate the secretion of specific insulin-like pep- 
tides. Beyond serving as a signaling molecule between neurons, 
octopamine could also act as an endocrine molecule itself. 
C. elegans possesses three putative octopamine receptors, 
ser-3, ser-6, and octr-1, whose expression outside the 
nervous system has not been extensively examined. Interest-
ingly, a small-molecule screen for drugs capable of extending
C. elegans lifespan identified a molecule that was shown to be 
an antagonist of the SER-3 receptor (Petrascheck et al., 2007). 
Future studies characterizing the respective roles of each octop- 
amine receptor will be enlightening in understanding how octop- 
amine elicits metabolic and longevity-related responses in the 
periphery. 

Although our studies point toward remodeling of mitochondrial 
metabolism as being required for AMPK longevity, they do not 
preclude a role for other cellular processes. AMPK and CRTCs 
are known regulators of autophagy (Egan et al., 2011; Seok 
et al., 2014), and autophagy is required for lifespan extension 
by AMPK activation in Drosophila (Ulgherait et al., 2014) and
calcineurin inhibition in C. elegans (Dwivedi et al., 2009). 
Highlighting the role of inter-tissue communication in AMPK 
longevity, tissue-specific activation of AMPK in the fly promotes 
systemic tissue homeostasis via the autophagic effector, Atg1,
which subsequently and cell nonautonomously promotes activa-
tion of autophagy in other tissues (Ulgherait et al., 2014). DAF-16/
FOXO is a known regulator of autophagy and is directly regulated 
by both AMPK and calcineurin and required for their effects 
on longevity (Greer et al., 2007; Tao et al., 2013). We saw enrich- 
ment of the DAE element in genes downregulated when 
both AMPK and CRTC-1 were active, suggesting CRTC-1 might 
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remotely regulate DAF-16/FOXO activity and/or activate its 
transacting antagonist PQM-1 (Tepper et al., 2013). Understand-
ing how neuronal CRTC-1 interacts with DAF-16 and/or PQM-1
and if they function intra- or intercellularly to modulate AMPK/ 
calcineurin-mediated longevity will be informative. 

In summary, this study highlights “mito-centric” metabolism as
the critical target of AMPK/CRTC-mediated effects on aging, 
and establishes that neurons are the causal and crucial site for 
CRTC-1-dependent regulation of longevity. Though both sen-
sory perception of nutrient availability in neurons (Petrascheck
et al., 2007) and organismal energy status (Burkewitz et al., 
2014) are known to modulate aging, our data suggest an 
emerging paradigm: the optimal pro-longevity intervention re- 
quires coordination of energy perception in the neurons with 
accurately executed metabolic programs in peripheral tissues. 
Indeed, we show here that the pro-longevity, AMPK-mediated 
metabolic program in peripheral tissues is overridden when the 
regulatory link between AMPK and CRTC-1 is broken exclusively 
in neurons, completely suppressing all gains in longevity for the 
organism. If neuronal energy-sensing mechanisms are domi- 
nant, as our data indicate, selectively targeting central sensors 
and regulators of energy homeostasis may be sufficient to 
generate a peripheral metabolic program that promotes healthier 
aging. 

EXPERIMENTAL PROCEDURES 

Additional details are provided in the Extended Experimental Procedures. 

Lifespans 

Lifespan experiments were performed on standard nematode growth media 
(NGM) at 20°C. Worms were synchronized by timed egg lays using gravid 
adults. When the progeny reached adulthood (~72 hr), 100 worms were 
transferred to fresh plates at 10-25 worms per plate and this was considered 
time = 0. Worms were transferred to fresh bacterial lawns every other day until 
the first deaths (10-14 days). Survival was scored every 1-2 days and a worm
was deemed dead when unresponsive to 3 taps on the head and tail. Worms 
were censored due to contamination on the plate, leaving the NGM, eggs 
hatching inside the adult or loss of vulval integrity during reproduction. Only 
in the lifespans noted (Table S1), 5-Fluoro-2'-deoxyuridine (FUDR) was
added to media to prevent excessive censoring. FUDR (100 µl; 1 mg/ml)
was added 24 hr before picking worms to the plate on the first day of adult-
hood, and worms were transferred off FUDR-containing plates once reproduc-
tion had ceased (day 7), after which the assays continued normally.
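
One common way to turn scoring records like these into a survival curve, with censored animals handled explicitly, is the Kaplan-Meier estimator. The sketch below assumes that approach purely for illustration, since this section does not state which estimator was used. The observations are hypothetical and the function name is our own.

```python
def kaplan_meier(observations):
    """Kaplan-Meier survival estimate.

    observations: list of (day, died) tuples, where died is True for a
    scored death and False for a censored worm (e.g., left the plate).
    Returns a list of (day, fraction_surviving) at each day with a death.
    """
    at_risk = len(observations)
    survival = 1.0
    curve = []
    for day in sorted({d for d, _ in observations}):
        deaths = sum(1 for d, died in observations if d == day and died)
        removed = sum(1 for d, _ in observations if d == day)
        if deaths:
            # Only deaths lower the estimate; censored worms just leave
            # the risk set after this time point.
            survival *= 1 - deaths / at_risk
            curve.append((day, survival))
        at_risk -= removed
    return curve


# Hypothetical records: three deaths and one censored worm.
records = [(10, True), (12, True), (12, False), (14, True)]
for day, fraction in kaplan_meier(records):
    print(day, round(fraction, 3))
```

Censored worms contribute to the number at risk up to the day they are removed, which is why censoring is tracked separately from death rather than discarded.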

RNA Sequencing 

The experiment was performed with three biological replicates. Eggs were 
synchronized to L1 larvae overnight in M9 and 1,000 larvae were grown to
L4 on NGM seeded with OP50-1 E. coli. Animals were collected and washed 
with M9 media to remove bacteria. Worms were then snap frozen in liquid ni- 
trogen. RNA was extracted by five freeze/thaw cycles in Qiazol then purified by 
RNeasy mini kit (QIAGEN). RNA quality was checked using an Agilent Technol-
ogies 2100 Bioanalyzer. All samples had an RNA integrity number of 10. cDNA
libraries were prepared from 4 µg of total RNA using the TruSeq RNA Sample
Preparation v2 kit (Illumina). See Extended Experimental Procedures for more
details of the data analysis. 

Metabolomics 

Synchronized L1 larvae were grown to L4 on NGM/OP50-1 before being
washed off plates with M9, resuspended in 0.6% formic acid, snap-frozen, 
and thawed immediately before lysis by sonication. Aliquots were taken for to- 
tal protein quantification, then an equal volume of acetonitrile was added to 
reach a final concentration of 0.3% formic acid and 50% acetonitrile. Samples
were then subjected to metabolomic analysis as detailed in Extended Experi-
mental Procedures.

Mitochondrial Analysis 

Mitochondria were analyzed in muscle cells from >10 day 1 adult worms per ge-
notype. Qualitative assessment of mitochondrial morphology was made by
scoring worms based on three categories: tubular (interconnected mitochon- 
drial network), intermediate (combination of interconnected network and iso- 
lated smaller mitochondria) or fragmented (mostly fragmented mitochondria). 
Quantitative assessments of percent mitochondrial coverage of the cell and 
mitochondrial area/perimeter ratio were made by measuring >30 muscle cells 
per genotype using a macro for ImageJ, as previously described (Dagda et al., 
2009). 
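
The two quantitative readouts can be illustrated on a thresholded binary image. The following is a simplified sketch and not the ImageJ macro of Dagda et al. (2009): it assumes each mask covers exactly one muscle cell, counts pixels for area, counts exposed pixel edges for perimeter, and uses toy masks in which "#" marks mitochondrial signal.

```python
def parse_mask(rows):
    """Convert rows of '#'/'.' characters into a grid of booleans."""
    return [[ch == "#" for ch in row] for row in rows]


def area_and_perimeter(mask):
    """Area = number of foreground pixels; perimeter = number of pixel
    edges that border background or the image boundary."""
    n_rows, n_cols = len(mask), len(mask[0])
    area = 0
    perimeter = 0
    for r in range(n_rows):
        for c in range(n_cols):
            if not mask[r][c]:
                continue
            area += 1
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                inside = 0 <= rr < n_rows and 0 <= cc < n_cols
                if not inside or not mask[rr][cc]:
                    perimeter += 1
    return area, perimeter


def summarize(mask):
    """Area/perimeter ratio and fraction of the cell covered by signal."""
    area, perimeter = area_and_perimeter(mask)
    total_pixels = len(mask) * len(mask[0])
    ratio = area / perimeter if perimeter else 0.0
    return {"area_perimeter_ratio": ratio, "coverage": area / total_pixels}


# Toy examples: one fused block versus four fragments of equal total area.
fused = parse_mask(["......", ".####.", ".####.", ".####.", ".####.", "......"])
fragmented = parse_mask(["##.##", "##.##", ".....", "##.##", "##.##"])

print(summarize(fused))
print(summarize(fragmented))
```

With total area held constant, splitting one object into several increases the perimeter, so the area/perimeter ratio falls as the network fragments. This is the intuition behind using the ratio as a fragmentation measure.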

ACCESSION NUMBERS 

The GEO accession number for the RNA-seq dataset in this paper is 
GSE58931.
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Supplemental Information includes Extended Experimental Procedures, seven 
figures, and seven tables and can be found with this article online at http://dx. 
doi.org/10.1016/j.cell.2015.02.004. 
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SUMMARY 

Homologous recombination (HR) mediates the ex- 
change of genetic information between sister or ho- 
mologous chromatids. During HR, members of the 
RecA/Rad51 family of recombinases must somehow 
search through vast quantities of DNA sequence to 
align and pair single-strand DNA (ssDNA) with a 
homologous double-strand DNA (dsDNA) template. 
Here, we use single-molecule imaging to visualize 
Rad51 as it aligns and pairs homologous DNA 
sequences in real time. We show that Rad51 uses a 
length-based recognition mechanism while interro- 
gating dsDNA, enabling robust kinetic selection 
of 8-nucleotide (nt) tracts of microhomology, which 
kinetically confines the search to sites with a high 
probability of being a homologous target. Successful 
pairing with a ninth nucleotide coincides with an 
additional reduction in binding free energy, and sub- 
sequent strand exchange occurs in precise 3-nt 
steps, reflecting the base triplet organization of the 
presynaptic complex. These findings provide crucial 
new insights into the physical and evolutionary 
underpinnings of DNA recombination. 

INTRODUCTION 

Homologous recombination (HR) is ubiquitous among all three 
kingdoms of life and serves as a driving force in evolution. HR 
is a major pathway for repairing DNA double-strand breaks 
(DSBs) and single-strand DNA (ssDNA) gaps and plays essential 
roles in repairing stalled or collapsed replication forks (Heyer 
et al., 2010; San Filippo et al., 2008). HR provides an alternative 
pathway for telomere maintenance (Eckert-Boulet and Lisby, 
2010), can lead to the duplication of long regions of chromo- 
somes (Smith et al., 2007), and some organisms utilize HR as 
the sole means of initiating DNA replication (Hawkins et al., 
2013). HR also generates genetic diversity and ensures proper 
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chromosome segregation during meiosis (Neale and Keeney, 
2006) and is a major source of phenotypic variation in many 
organisms (Fraser et al., 2007; Hastings et al., 2009). In hu- 
mans, aberrant HR underlies chromosomal rearrangements 
often associated with cancers, cancer prone syndromes, and 
numerous genetic diseases (Heyer et al., 2010; San Filippo 
et al., 2008). 

DSB repair in Saccharomyces cerevisiae has long served 
as a paradigm for studying HR (Heyer et al., 2010; San Filippo
et al., 2008). The DNA ends present at DSBs are first processed
by 5'→3' strand resection, yielding 3' ssDNA overhangs whose
production coincides with the binding of replication protein A 
(RPA). RPA is then replaced by Rad51 or the meiosis-specific 
recombinase Dmcl, which is thought to have arisen by a 
gene duplication event early in the evolutionary history of 
eukaryotes (Lin et al., 2006). Rad51 and Dmcl are both closely 
related to Escherichia coii RecA. These proteins are DNA- 
dependent ATPases that form right-handed helical filaments 
on ssDNA, and the resulting presynaptic complexes (PCs) 
display a striking degree of conservation from bacteriophage 
to humans (Bianco et al., 1998). Structural studies have re- 
vealed that the presynaptic ssDNA is organized into base trip- 
lets that are maintained in near B-form conformation, but there 
is a 7.8 A rise between adjacent triplets causing an overall 
extension of the ssDNA (Chen et al., 2008). Single-molecule 
force measurements suggest that this ssDNA extension may 
promote release of nonhomologous double-strand (dsDNA) 
and facilitate strand exchange with homologous dsDNA (Danilo- 
wicz et al., 2014). Many proteins participate in HR, including 
those encoded by the conserved RAD52 epistasis group of 
genes (Heyer et al., 2010; San Filippo et al., 2008). Despite these 
layers of complexity, Rad51 , like other members of the Rad51/ 
RecA family, can promote strand invasion in the absence of 
other proteins, implying that more specialized accessory factors 
augment the basal recombinase activities without conferring 
new catalytic properties. 

Rad51/RecA recombinases must align ssDNA with a ho- 
mologous duplex elsewhere in the genome. This process is 
referred to as the “homology search” and it is conceptually 
similar to target searches conducted by all other site-specific

Figure 1. Visualizing dsDNA Capture by
Rad51

(A) Schematic of Rad51-ssDNA curtains.

(B) Strategy for detecting binding of Atto565-labeled
dsDNA to the PCs.

(C) Wide-field image of Rad51 PCs bound to
Atto565-DNA1.0.

(D and E) Binding site distribution (D) and pair-wise
distance distribution (E) of Atto565-DNA1.0.

(F) Kymograph showing dissociation of Atto565-DNA1.0
from a single Rad51 PC; 100-ms frames
were collected at 20-s intervals.

(G) Dissociation kinetics of Atto565-DNA1.0. Unless
otherwise stated, error bars for all binding site
distributions and survival probability plots represent
70% confidence intervals obtained through
bootstrap analysis.

See also Figures S1, S2, S3, S4, and S6.



DNA-binding proteins (Barzel and Kupiec, 2008; Renkawitz
et al., 2014; von Hippel and Berg, 1989). The principles that
govern sequence alignment during HR remain poorly understood
because the corresponding intermediates are transient
and asynchronous (Barzel and Kupiec, 2008; Renkawitz et al.,
2014). What features are the recombinases searching for within
dsDNA? How do they distinguish between nonhomologous and
homologous sequences? Over what length scales do they test
for homology? What distinguishes search intermediates from
the commitment to strand exchange? These questions all
pertain to the overarching issue of how homology is efficiently
located given the vast sequence space encoded by the genome
(Neale and Keeney, 2006). We sought to address these questions
by visualizing the homology search at the single-molecule
level. Our results lead to a model in which 8-nt microhomology
motifs serve as the fundamental units of molecular recognition
by S. cerevisiae Rad51, and this initial event is distinct from
subsequent strand invasion. We show that the physical
principles underlying the ability of Rad51 to search for and
align homologous DNA sequences are
broadly conserved among the Rad51/RecA
family members. This mechanism
can drastically reduce the amount of time
necessary to align homologous dsDNA
sequences.



Assembly of Rad51 Presynaptic
Complexes

We used ssDNA curtains and total internal
reflection fluorescence microscopy
(TIRFM) to visualize Rad51 PCs (Gibb
et al., 2014a). The ssDNA was generated
using M13mp18 (7,249-nt) as a template
for rolling circle replication (Figures 1A
and S1) and then anchored to a lipid
bilayer within a microfluidic chamber
through a biotin-streptavidin linkage and
aligned along chromium (Cr) barriers by application of hydrodynamic
force. The ssDNA unravels when incubated with RPA-eGFP,
and the downstream ends of the RPA-ssDNA are
anchored to exposed Cr pedestals. Addition of wild-type
S. cerevisiae Rad51 led to efficient, ATP-dependent PC assembly
(Figures S1 and S2).

Nonhomologous dsDNA Capture by Rad51 

Rad51/RecA recombinases must interrogate nonhomologous
dsDNA while attempting to locate and align homologous sequences.
We mimicked this process by testing the ability of
the Rad51 PCs to interact with nonhomologous 70-base pair
(bp) dsDNA oligonucleotides (Figure 1B). To visualize dsDNA
binding, we injected Atto565-labeled dsDNA into the sample
chamber; for brevity we designated this substrate Atto565-DNA1.0.
Following a brief incubation, unbound dsDNA was
flushed away and the remaining molecules were visualized by
TIRFM. These experiments revealed Atto565-DNA1.0 bound to
the PCs with no evident site preference within our resolution
limits (Figures 1C-1E), and most of the bound dsDNA (78.4%)
exhibited single-step photo-bleaching (not shown). Controls
with RPA-ssDNA (minus Rad51) confirmed that dsDNA capture
was Rad51-dependent (Figure S3A). In addition, the PCs
rapidly disassembled when ATP was replaced with ADP (Figure S2),
and the bound dsDNA was also quickly released
when reactions were chased with ADP, indicating that dsDNA
retention required the continued presence of Rad51 (Figures
S3B-S3D). Kinetic measurements yielded a dissociation rate
(koff) of 0.062 ± 0.001 min⁻¹ for Atto565-DNA1.0, corresponding
to a lifetime of ~16 min (Figures 1F and 1G). This was an
extraordinarily stable interaction for a seemingly nonhomologous
dsDNA, and such long-lived intermediates would appear
incompatible with an efficient search mechanism. We next
sought to understand the physical basis for these long
lifetimes.
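
The quoted lifetime is what one obtains by treating dissociation as a first-order process, in which the mean lifetime is the reciprocal of the dissociation rate. The following is a minimal sketch under that single-exponential assumption; the function names are illustrative and are not taken from the original analysis.

```python
import math

K_OFF_PER_MIN = 0.062  # dissociation rate reported in the text (min^-1)


def mean_lifetime(k_off):
    """Mean lifetime of a first-order (single-exponential) dissociation process."""
    return 1.0 / k_off


def survival_probability(t, k_off):
    """Fraction of complexes still bound at time t, assuming S(t) = exp(-k_off * t)."""
    return math.exp(-k_off * t)


lifetime_min = mean_lifetime(K_OFF_PER_MIN)
half_life_min = math.log(2) / K_OFF_PER_MIN

print(f"mean lifetime: {lifetime_min:.1f} min")
print(f"half-life: {half_life_min:.1f} min")
print(f"fraction bound after 20 min: {survival_probability(20, K_OFF_PER_MIN):.2f}")
```

Note that the mean lifetime (1/koff) and the half-life (ln 2/koff) differ by a factor of about 1.44, so it matters which of the two a reported "lifetime" refers to.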

Substrate Length Does Not Impact dsDNA Retention 

If nonhomologous dsDNA capture primarily involved nonspecific 
electrostatic contacts with the phosphate backbone, then the 
lifetime of the bound intermediates should vary with dsDNA 
length. We tested this possibility with 35-bp and 18-bp dsDNA 
substrates. Surprisingly, the truncated substrates bound tightly
to the PCs, although more substrate and longer incubation times
were required for initial engagement (Figure S4). We conclude
that substrate length had a modest impact on initial association
with the PC, but did not affect retention of the captured dsDNA,
suggesting that the observed intermediates were not maintained
primarily through nonspecific contacts with the dsDNA phosphate
backbone.

Microhomology Contributes to dsDNA Capture 

We next asked whether sequence microhomology might
contribute to dsDNA capture. Analysis of DNA1.0 revealed
many short tracts of microhomology complementary to sequences
scattered throughout the M13mp18 ssDNA, including
12 regions with ≥8-nt of microhomology (Figures 2A and 2B).
Previous reports suggested that E. coli RecA can pair DNA
substrates perhaps as short as 8-nt in length (De Vlaminck
et al., 2012; Hsieh et al., 1992; Xiao et al., 2006). Based on this
knowledge, we designed a new substrate (Atto565-DNA2.0),
which retained the same sequence composition as DNA1.0, but
lacked microhomology ≥8-nt in length (Figures 2D and 2E).
We readily detected capture of Atto565-DNA1.0 (Figure 2C);
however, we were unable to detect stable capture of Atto565-DNA2.0
under identical conditions (Figure 2F), despite the
fact that this substrate contains numerous tracts of microhomology
≤7-nt in length (Figure 2D).

Stable dsDNA Capture Requires 8-nt Tracts of 
Microhomology 

Our results imply that dsDNA capture involves 8-nt or longer 
tracts of microhomology. This hypothesis predicts that a single 
8-nt tract of microhomology added to an otherwise nonhomol- 
ogous dsDNA should confer stable association with the PC. 
We tested this prediction with a series of substrates bearing 
precisely 8-nt of microhomology (Figure 3A). Remarkably, addi- 
tion of a single 8-nt tract of microhomology was sufficient to 



confer stable binding of a nonhomologous dsDNA to the PC, 
and similar results were obtained for 8-nt microhomology motifs 
at different locations (Figures 3A-3E). The binding site distributions
and the pairwise distance distributions of Atto565-DNA2.1
revealed a 2.6 ± 0.2 μm periodicity, consistent with
the expectation that the dsDNA was captured at a single
position on M13mp18, and this conclusion was supported by
analysis of a substrate targeted to an alternative location (Figures
1D, 1E, 3F, 3G, and S5).

The requirement for microhomology suggested that captured
intermediates were retained through Watson-Crick pairing. This
hypothesis predicts that the binding lifetime should scale with
melting temperature (Tm), which was confirmed using substrates
bearing 8-nt tracts of microhomology of varying AT-content
(Figure 3H). Moreover, the change in free energy (ΔΔG‡)
scaled with hydrogen bonding potential, with each hydrogen
bond contributing ~0.14 kBT to the binding of the 8-nt motif.
The modest contribution to overall stability for each hydrogen
bond was consistent with the requirement that the homology
search be driven by thermal fluctuations and supports the
notion that stretch-induced disruption of base stacking destabilizes
the Watson-Crick base pairs relative to B-DNA (Chen
et al., 2008).
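
To make the hydrogen-bond scaling concrete, the sketch below counts Watson-Crick hydrogen bonds (two per A·T pair, three per G·C pair) in an 8-nt tract and multiplies by the per-bond value quoted above. The strictly additive model and the example sequences are assumptions made for illustration; they are not the published substrates or the authors' fitting procedure.

```python
K_BT_PER_HBOND = 0.14  # approximate per-hydrogen-bond contribution quoted in the text (kBT)

# Watson-Crick hydrogen bonds per base pair: A.T pairs form 2, G.C pairs form 3.
HBONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def hydrogen_bonds(tract):
    """Count Watson-Crick hydrogen bonds for a fully paired tract."""
    return sum(HBONDS[base] for base in tract.upper())


def pairing_energy_kbt(tract, per_bond=K_BT_PER_HBOND):
    """Additive estimate (in kBT) of the hydrogen-bonding contribution (assumed model)."""
    return hydrogen_bonds(tract) * per_bond


# Hypothetical 8-nt tracts spanning the AT-content range.
for tract in ("ATATATAT", "ATGCATGC", "GCGCGCGC"):
    print(tract, hydrogen_bonds(tract), round(pairing_energy_kbt(tract), 2))
```

Under this simple picture, the difference between an all-AT and an all-GC 8-nt tract is only about 1 kBT, which is in keeping with the statement that each hydrogen bond contributes modestly.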

We also tested how microhomology length influenced dsDNA
capture (Figure 3I). We were unable to detect any stable binding
intermediates when the 8-nt tract of microhomology was
decreased to 7-nt (Atto565-DNA2.6), in agreement with the
conclusion that 8-nt of microhomology was necessary for stable
dsDNA capture (Figure 3I and see below). In contrast,
increasing the 8-nt tract of microhomology to 9-nt reduced
the dissociation rate, and additional length increases resulted
in step-wise reductions in the dissociation rates in precise 3-nt
increments (Figure 3I and see below). The microhomology
requirement, the periodic binding patterns, and the influence
of AT-content and microhomology length all suggested that
the bound intermediates were maintained through Watson-Crick
interactions.

Transient dsDNA Sampling by Rad51 

Rad51 did not stably capture dsDNA lacking 8-nt tracts of microhomology,
but it must be transiently sampling these molecules.
Even microhomology-bearing dsDNA must in most instances
be transiently sampled, because the vast majority of bimolecular
encounters will occur at nonhomologous sites. Therefore, the
70-bp substrates used in our assays offered the unique potential
for exploring how Rad51 samples and rejects dsDNA while
searching for homology. We detected these transient intermediates
by visualizing reactions in real time at 60-ms resolution (Figures
4A-4D). Remarkably, the survival probabilities of substrates
lacking ≥8-nt of microhomology (Atto565-DNA2.0) did not decay
exponentially, but rather scaled as a power-law, with 50% of the
molecules dissociating within 0.54 s (Figure 4E), even though this
substrate harbors numerous ≤7-nt tracts of microhomology
(Figure 2D). Power-law dependence was also observed over
short time regimes for a substrate bearing a single 8-nt tract of
microhomology (Atto565-DNA2.1), whereas the lifetimes were
limited by photo-bleaching at longer time scales, as expected
(Figure 4E).

Figure 2. Stable Capture of Nonhomologous
dsDNA

(A) Analysis showing the total number (N) and the
average number (±SD) for the given length of microhomology
within each occupied 70-nt window
along M13mp18; additional details are presented as
Supplemental Information.

(B) Positions of microhomology (≥8-nt) within
Atto565-DNA1.0 (color-coded bars indicate relative
positions of microhomology within the dsDNA) and
the schematic illustration showing the corresponding
locations (indicated with color-coded arrowheads)
of the tracts of microhomology along a single
unit length M13mp18 ssDNA substrate (lower
panel). Illustrations are not to scale.

(C) Kymograph showing binding of Atto565-DNA1.0
to a single Rad51 PC; 100-ms frames were
collected at 5-s intervals.

(D and E) Analysis (D) and schematic (E) of a redesigned
70-bp dsDNA (Atto565-DNA2.0) lacking
8-nt tracts of microhomology. Error bars represent
SD.

(F) Kymograph showing Atto565-DNA2.0 incubated
with a single PC; data were collected as in (C).

See also Figure S6.

Figure 3. 8-nt Tracts of Microhomology Are
Sufficient for dsDNA Capture 

(A) Substrates bearing a single 8-nt tract of micro- 
homology (highlighted in magenta) at different po- 
sitions within the 70-bp dsDNA. 

(B) Average number of Atto565-dsDNA bound per 
PC. N corresponds to the number of PCs counted. 
Error bars represent SD. 

(C) Kymograph showing an example of Atto565-DNA2.1
dissociating from a PC.

(D and E) Survival probability plots (D) and dissociation
rates (E) for each substrate.

(F and G) Binding distribution (F) and pairwise distance
distribution (G) for Atto565-DNA2.1.

(H) Design, survival probability plots, and dissociation
rates for DNA substrates bearing a single 8-nt
tract of microhomology with varying AT-content.

(I) Design, survival probability plots, and dissociation
rates for substrates bearing 8- to 15-nt of microhomology;
sequences and survival probability
curves for the 10-nt, 11-nt, 13-nt, and 14-nt substrates
are omitted for clarity. There was no detectable
binding activity for Atto565-DNA2.6 in these
assays. In (D)-(I), N corresponds to the number
of Atto565-DNA molecules measured. Error bars
represent SD.

See also Figure S5.

Figure 4. Transient Sampling of dsDNA Lacking
Microhomology

(A) Strategy for visualizing dsDNA sampling at
60-ms resolution.

(B-D) Kymographs showing (B) Atto565-DNA1.0 in
the absence of the PC (control surface), and
Rad51 PCs sampling (C) Atto565-DNA2.0 or (D)
Atto565-DNA2.1.

(E) Log-log plot revealing the power-law dependence
of the transient search intermediates.
Dashed lines represent a single exponential fit
to the photo-bleaching data, power-law fits for
Atto565-DNA2.0 and Atto565-DNA2.6, and a combination
of a power-law and single exponential fit for
Atto565-DNA2.1.

(F) Energy landscape describing dsDNA sampling
and strand invasion by Rad51. The heat map and
open circles (±SD) represent calculated values for
normalized occupation probability and ΔΔG‡
values based on experimental data, respectively.
The black line is a representation of the landscape,
and the heights of the energy barriers between
states are for illustrative purposes only. Additional
details are presented in the main text and Supplemental
Information.

(G) Distribution of kinetic rates for dsDNA sampling
and capture by Rad51. Solid lines represent
experimental data and the dashed line reflects
intermediates that are sampled too rapidly to be
detected.

See also Figure S6. 



We next conducted real-time measurements with Atto565-DNA2.6,
which differs from Atto565-DNA2.1 by just a single nucleotide
(Figure 3I; Supplemental Information); as indicated above,
this single nucleotide change reduces the 8-nt tract of microhomology
to 7-nt and abolishes stable capture of this substrate by
Rad51. Instead, Atto565-DNA2.6 exhibits power-law distributed
dissociation kinetics with 50% of the molecules dissociating
within 0.82 s (Figure 4E). These findings indicate that all the
dsDNA substrates were initially sampled through the same
pathway, as revealed by its characteristic
power-law dependence, but only substrates
bearing 8-nt of microhomology
transitioned into the long-lived state.

A crucial implication of this power-law 
behavior is that the transient sampling 
events cannot be ascribed to a single 
conformational state that can be as- 
signed a unique dissociation rate con- 
stant, but rather reflects the existence 
of a highly diverse ensemble of states 
with a correspondingly broad distribution 
of dissociation rates (Austin et al., 1975; 
Frauenfelder et al., 1991). The physical 
basis for this power-law dependence is 
readily understood given the vast number 
of potential intermediates. If one assumes 
recognition involving 8-nt sequence mo- 
tifs, then a 70-bp dsDNA can be mis- 
aligned with a total of 453,652 distinct sites on M13mp18, 
each of which can give rise to energetically distinct states based 
on differences in sequence composition. Power-law distributed 
dissociation kinetics are also consistent with recent molecular 
dynamics simulations, which suggest a large number of interme- 
diates as RecA probes sequences for homology (M. Prentiss, 
personal communication). These considerations highlight the 
tremendous challenge faced during the homology search, even 
within our simplified experimental system. 
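
The link between a broad ensemble of states and power-law survival can be illustrated numerically. If each state dissociates exponentially but the rate constants are spread over a gamma distribution, the ensemble-averaged survival probability is exactly a power law, S(t) = (1 + θt)^(-α). The sketch below checks this by simulation; the gamma distribution and its parameters are assumptions chosen for illustration and are not fitted to the data reported here.

```python
import math
import random


def mixture_survival(t, rates):
    """Survival probability at time t averaged over an ensemble of exponential states."""
    return sum(math.exp(-k * t) for k in rates) / len(rates)


def power_law_survival(t, shape, scale):
    """Closed form for gamma-distributed rates: S(t) = (1 + scale * t) ** -shape."""
    return (1.0 + scale * t) ** (-shape)


random.seed(0)
SHAPE, SCALE = 1.0, 2.0  # hypothetical gamma parameters (scale in s^-1)
rates = [random.gammavariate(SHAPE, SCALE) for _ in range(100_000)]

for t in (0.1, 1.0, 10.0, 100.0):
    simulated = mixture_survival(t, rates)
    analytic = power_law_survival(t, SHAPE, SCALE)
    single = math.exp(-SHAPE * SCALE * t)  # one state with the same mean rate
    print(f"t={t:>6} s  ensemble={simulated:.4f}  power-law={analytic:.4f}  single-exp={single:.2e}")
```

At long times the ensemble retains a heavy tail that a single exponential with the same mean rate cannot reproduce, which is the qualitative signature seen on a log-log survival plot.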
Energy Landscape for dsDNA Sampling and Strand
Invasion

Our data provide a free energy landscape describing dsDNA
sampling and strand invasion by Rad51 (Figure 4F; Supplemental
Information). The initial search process is characterized by transient
intermediates that encompass a broad distribution of energetic
states, which could reflect thousands of distinct complexes
as Rad51 interrogates different sequences for homology (Figures
4F and 4G). Recognition of an 8-nt tract of microhomology results
in an ~8.2 kBT drop in free energy (ΔΔG‡) and gives rise to a >4
order-of-magnitude decrease in dissociation kinetics, providing
a robust length-based mechanism for kinetically discriminating
against sequences that are unlikely to be fully homologous (Figures
4F and 4G). This length-based microhomology recognition
event is the single largest change in the energy landscape
and most likely reflects a conformational transition within the
Rad51-ssDNA-dsDNA ternary complex, the exact nature of
which remains to be explored. The finding that recognition of an
8-nt tract (as opposed to either 6- or 9-nt) coincided with the
largest drop in free energy was not anticipated given that ssDNA
within the PC is organized into base triplets (Chen et al., 2008).
Following microhomology capture, Rad51 can probe the flanking
DNA for additional homology while attempting strand invasion.
Pairing with a ninth nt results in an additional ~0.4 kBT
reduction in free energy, revealing that incorporation of the ninth
nt enabled more stable engagement of the third base triplet. All
subsequent reductions in free energy occurred in precise 3-nt increments,
suggesting that the ssDNA bound by Rad51 was organized
into base triplets, as observed for E. coli RecA (Chen et al.,
2008), and the quantized reductions in binding energy were the
functional consequence of this triplet organization. Together,
these findings also indicate that capture of the first 8-nt tract of
microhomology is mechanistically distinct from the subsequent
reactions involved in strand invasion, suggesting that recognition
of the ninth nt demarks the beginning of actual strand exchange,
allowing subsequent reactions to take place in 3-nt steps.

Sliding or Intersegmental Transfer Do Not Contribute to 
Microhomology Capture 

Prior smFRET measurements suggested that 1D sliding might
contribute to DNA alignment by RecA over short distances
(Ragunathan et al., 2012). However, in agreement with prior
biochemical studies (Adzuma, 1998), our data revealed no evidence
of 1D sliding for Rad51, although we do not rule out
the possibility that sliding might take place over short distances
(<270-nm). Other studies have shown that sequence alignment
by RecA involves intersegmental transfer (Forget and Kowalczykowski,
2012). We found no evidence that the 70-bp dsDNA molecules
moved by intersegmental transfer (Figure S6); however,
these results do not argue against intersegmental transfer as a
crucial component of the Rad51 homology search (see below);
rather, our findings are as anticipated for a search entity
engaging a single unit-length binding element.

Facilitated Exchange Promotes Turnover of dsDNA 
Bound to the Presynaptic Complex 

Strand invasion in S. cerevisiae can be detected within ~10-60 min
of DSB formation, so the search for homology must
be completed within this time window. However, 8-nt is insufficient
to define a sequence as statistically unique within the
S. cerevisiae genome, and it is difficult to envision how recombination
could be executed on a relevant timescale if the PC
became kinetically trapped every time it encountered a ≥8-nt
tract of microhomology. This implies the existence of unknown
mechanisms for disrupting these intermediates.
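
A back-of-the-envelope estimate shows why an 8-nt tract cannot be unique. The sketch below computes the expected number of chance matches to one specific tract, assuming a genome of roughly 12.1 Mbp with random sequence and equal base frequencies, and counting both strands. All of these are simplifying assumptions; real genomes have biased composition and repeats.

```python
GENOME_BP = 12_100_000  # approximate S. cerevisiae genome size (assumption, ~12.1 Mbp)


def expected_occurrences(tract_length, genome_bp=GENOME_BP):
    """Expected matches to one specific tract in a random-sequence genome, both strands."""
    positions = 2 * (genome_bp - tract_length + 1)
    return positions / 4 ** tract_length


for n in (8, 9, 12, 15):
    print(f"{n}-nt tract: ~{expected_occurrences(n):.3g} expected matches")
```

Under these assumptions a given 8-nt tract is expected to recur hundreds of times, whereas a 15-nt tract is expected to occur by chance far less than once per genome.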

One possibility is that specific enzymes might disrupt
intermediates involving short microhomology motifs; there are
numerous helicases/translocases with the potential to fulfill
such a role (e.g., Mph1, Srs2, Sgs1, Rdh54, and/or Rad54)
(Heyer et al., 2010; Renkawitz et al., 2014; San Filippo et al.,
2008). We do not exclude the possibility that these or other proteins
may contribute to the homology search, perhaps by promoting
the turnover of Rad51 bound to incorrect 8-nt tracts of
microhomology; future work will be necessary to test this hypothesis.
However, Rad51, like many other Rad51/RecA family
members, can catalyze strand exchange in vitro with no need
for these accessory factors despite the potential for sequence
misalignment at any of the hundreds of 8-nt microhomology
motifs present in the plasmids typically used for these assays,
underscoring that the ability to search for homology is an intrinsic
property of Rad51/RecA proteins. Therefore, we asked whether
a more fundamental mechanism(s) might promote dissolution 
of microhomology-bound intermediates. It has recently been 
recognized that facilitated exchange can contribute to disruption 
of protein-nucleic acid interactions (Gibb et al., 2014a; Graham 
et al., 2011; Sing et al., 2014) and may be a general but under- 
appreciated phenomenon that influences macromolecular 
interactions under crowded physiological settings. Facilitated 
exchange reflects the existence of microscopically dissociated 
intermediates, which only undergo macroscopic dissociation 
when competing interactions arise from other molecules in the 
local environment. These concepts are readily extended to reac- 
tions involving the PC. 

We considered the possibility that dissolution of intermediates 
arising from captured microhomology might be promoted by 
facilitated exchange with other dsDNA molecules. The hypothe- 
sis that DNA might disrupt search intermediates is intriguing 
given the high concentration of DNA within the nucleus and the 
potential ubiquity of such a mechanism. To test this hypothesis, 
we asked whether dsDNA bound to the PCs was released more
rapidly into free solution when challenged with free competitor
dsDNA. For this, Atto565-DNA1.0 was pre-bound to the PCs,
and the reactions were chased with unlabeled competitor
(DNA1.0; Figure 5A). Remarkably, the competitor chase accelerated
macroscopic dissociation of Atto565-DNA1.0 by up to
~3-fold (Figures 5B-5E). We conclude that free dsDNA can
accelerate turnover of dsDNA bound to the PCs, consistent
with a mechanism involving facilitated exchange.
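
The Figure 5 legend describes the competitor dependence as a Hill-type curve with an intercept for the reaction without competitor. One plausible form of such a curve is sketched below. The exact parameterization used for the published fit is not given in this section, so the functional form, the half-maximal concentration, and the Hill coefficient are all hypothetical; only the no-competitor rate and the approximately 3-fold acceleration come from the text.

```python
def hill_with_intercept(conc, k0, k_max, k_half, n):
    """Dissociation rate vs competitor: k0 + (k_max - k0) * c^n / (K^n + c^n)."""
    if conc <= 0:
        return k0
    return k0 + (k_max - k0) * conc ** n / (k_half ** n + conc ** n)


K0 = 0.062      # min^-1, rate without competitor (reported value)
K_MAX = 3 * K0  # assumes the ~3-fold acceleration corresponds to the plateau
K_HALF = 0.5    # hypothetical half-maximal competitor concentration (uM)
HILL_N = 1.0    # hypothetical Hill coefficient

for conc in (0.0, 0.1, 0.5, 1.0, 5.0):
    rate = hill_with_intercept(conc, K0, K_MAX, K_HALF, HILL_N)
    print(f"{conc:4.1f} uM competitor -> {rate:.3f} min^-1")
```

The intercept term keeps the curve anchored at the spontaneous dissociation rate, so the fitted acceleration is measured relative to the competitor-free reaction rather than to zero.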

Sequence and Length Requirements for Facilitated 
Exchange 

PCs capture dsDNA through 8-nt tracts of microhomology, 
implying that facilitated exchange might involve overlapping 
tracts of microhomology. If correct, then facilitated exchange 
should only occur with competitor substrates bearing identical 
8-nt tracts of microhomology. Indeed, reactions with two

Figure 5. Facilitated Exchange of Captured
Intermediates

(A) Strategy for quantifying dsDNA dissociation after
injection of unlabeled competing DNA.

(B) Kymographs showing the dissociation of Atto565-DNA1.0
from the Rad51 PC in the absence (upper
panel) and presence (lower panel) of unlabeled
competitor DNA1.0.

(C and D) Dwell time analysis of dissociation kinetics
(C) and dissociation rates (D) for Atto565-DNA1.0
when chased with varying concentrations of dark
DNA1.0. The dissociation rates as a function of dark
competitor are fit to a Hill-type curve with an intercept
conveying the reaction in the absence of
competitor. N corresponds to the number of
Atto565-DNA molecules measured. Error bars
represent SD.

(E and F) Dissociation rates for Atto565-DNA1.0 (E)
and Atto565-DNA2.1 (F) when challenged with
different competitor substrates (1 μM each), as indicated;
like colors correspond to competitors bearing
overlapping tracts of microhomology, and competitors
lacking overlapping microhomology are shown in
black. N corresponds to the number of Atto565-DNA
molecules measured. Error bars represent SD.

(G and H) Schematic (G) and corresponding (H) 
data for substrates used to test the influence of 
microhomology length and alignment on facilitated 
exchange.

different Atto565-labeled substrates and a series of competitors
confirmed that facilitated exchange required overlapping tracts
of microhomology (Figures 5E and 5F), and exchange was abolished
if the competing microhomology was shifted by even a single
nucleotide in either direction (not shown).

We next tested how facilitated exchange was influenced by
microhomology length. The increased stability of substrates
bearing longer tracts of microhomology (see Figure 3I) was reflected
in the finding that shorter tracts of microhomology were
more readily exchanged with longer tracts, whereas longer tracts
of microhomology were more resistant to exchange with shorter
tracts (Figures 5G and 5H). Moreover, a 15-nt tract of microhomology
was sufficient to render a bound substrate completely
resistant to facilitated exchange. Together, these results demonstrate
that facilitated exchange requires overlapping microhomology,
indicate that once the PC has engaged a particular
dsDNA it ignores substrates lacking overlapping microhomology,
and suggest that facilitated exchange can lead to preferential
association with longer microhomology motifs. These results
also imply the existence of a length-based threshold of ~15-nt
as perhaps demarking the commitment to strand exchange;
reversibility at this stage of the reaction would likely require
accessory proteins dedicated to dissolution of aberrant strand
exchange intermediates (Heyer et al., 2010; San Filippo et al.,
2008).

In addition to facilitated exchange, Atto565-labeled substrates
bearing an 8-nt microhomology motif were also displaced from
the PC when challenged with a fully homologous 70-bp substrate
(DNA3.0), but only if the homologous substrate overlapped
in sequence with the bound dsDNA (Figures 5E and 5F). This
finding implies that the initiation of strand exchange with a homologous
substrate anywhere along the PC would be sufficient
to drive disruption of captured 8-nt tracts of microhomology
located at adjacent positions along the PC, ensuring that
strand invasion could progress unimpeded once homology was
correctly identified.

Joint Molecules Made with Fully Homologous dsDNA 
Resist Disruption 

The results presented above lead to four predictions for reactions
involving homologous substrates: (1) initial sampling of
the homologous substrate should exhibit power-law dependence
over short time regimes, (2) a homologous substrate
should bind to all locations bearing ≥8-nt of microhomology,
(3) a captured homologous substrate should exhibit two categories
of lifetimes corresponding to those molecules bound to
microhomology motifs and those that are bound to the full region
of homology, and (4) the captured intermediates should be differentially
affected when chased with competitor dsDNA. We
tested these predictions using a homologous 70-bp substrate
(Atto565-DNA3.0); analysis of this substrate revealed ≥8-nt
tracts of microhomology at 19 distinct sites on M13mp18 ssDNA
(Figure 6A). As anticipated, the initial sampling intermediates
exhibited characteristic power-law behavior, reflecting the existence
of a diverse ensemble of transient complexes (Figures 6B
and 6C). Once captured, lifetime analysis of the bound dsDNA
revealed the existence of two spatially distinct populations:
shorter-lived intermediates and longer-lived intermediates that
displayed a periodic binding distribution as expected for the
unique 70-nt region of homology (Figures 6C and 6D). As predicted,
only the shorter-lived intermediates were disrupted
when challenged with competing dsDNA, whereas the longer-lived
complexes were resistant to facilitated exchange (Figures
6E and 6F). We conclude that Rad51 utilizes a length-based microhomology
recognition mechanism even when presented with
a fully homologous substrate and that products generated
through strand invasion of the homologous substrate were highly
stable.

Model for DNA Sequence Alignment during HR 

Our results are unified in a model for how Rad51 aligns DNA sequences
during HR (Figure 7A). For clarity, Figure 7A depicts a
single interacting unit; we anticipate multiple unit-length interactions
will occur throughout the PC, as expected for intersegmental
transfer (Forget and Kowalczykowski, 2012). We propose
that Rad51 samples dsDNA in 8-nt increments and quickly rejects
any sequences lacking 8-nt tracts of contiguous microhomology.
This stage of the reaction is characterized by a complex
energetic landscape as Rad51 quickly explores a vast amount of
sequence space. The presence of an 8-nt tract of microhomology
allows dsDNA to be captured through Watson-Crick pairing,
enabling Rad51 to probe the flanking duplex for additional
complementarity while attempting more extensive strand exchange.
If pairing with a ninth nt is successful, then the resulting
intermediates are rendered more stable by virtue of more extensive
Watson-Crick pairing in precise 3-nt increments, eventually
crossing a threshold (~15-nt) beyond which they are much less
susceptible to either spontaneous dissociation or facilitated
exchange. In contrast, if further strand invasion fails, then any
search intermediates bound to incorrect 8-nt tracts of microhomology
can be disrupted by either spontaneous dissociation or
facilitated exchange. In addition, successful capture of full homology
anywhere along the length of the PC will disrupt any existing
search intermediates, allowing unimpeded strand exchange.

This model hints at a deeper understanding for how E. coli
RecA might search for homology: RecA can capture as little
as 8-nt of homology (Hsieh et al., 1992), and re-evaluation of
the 1,762-nt ssDNA and 48,502-bp dsDNA sequences used to
substantiate the RecA intersegmental transfer mechanism
reveals a total of 2,089 tracts of 8-nt microhomology (Forget
and Kowalczykowski, 2012). We suggest that RecA may establish
numerous points of contact with dsDNA through these short
tracts of microhomology.

A Conserved Search Mechanism for the RadSI /RecA 
Recombinases 

The salient feature of our model for the homology search is that
it minimizes nonproductive interactions with short (≤7-nt)
dsDNA sequences that have little chance of being the homologous
target. This assertion is based upon two key features of
S. cerevisiae Rad51: (1) rapid sampling and rejection of dsDNA
lacking microhomology motifs through a mechanism characterized
by its distinctive power-law dependence, and (2) length-specific
kinetic selection of microhomology tracts (Figure 7A).
We next asked whether human Rad51 (hRad51), S. cerevisiae
Dmc1, and E. coli RecA behaved similarly. Remarkably, all three
proteins displayed power-law behavior while transiently sampling
dsDNA that lacked 8-nt microhomology motifs, with 50%
of the sampling events occurring within 3.5 s, 1.1 s, and 2.5 s
for hRad51, ScDmc1, and RecA, respectively (Figure 7B). All
three proteins preferentially captured substrates harboring
8-nt of microhomology (Figure 7C). These results revealed that
recognition of an 8-nt microhomology motif coincided with
~6.1, ~6.5, and ~6.2 kBT (ΔΔG‡) reductions in the free energy
landscapes for hRad51, ScDmc1, and RecA, respectively,
reflecting the drastic differences in affinity for dsDNA with and
without an 8-nt tract of microhomology. These findings suggest
that the ability to interrogate dsDNA through a mechanism
involving length-specific microhomology recognition emerged
early in the evolutionary history of the RAD51/recA gene family.

Figure 6. Sampling and Capture of a Fully
Homologous Substrate

(A) Microhomology analysis and schematic of the
70-bp homologous dsDNA substrate (Atto565-DNA3.0)
highlighting the ≥8-nt tracts of microhomology
complementary to the M13mp18 ssDNA substrate.

(B) Power-law dependence of search intermediates
observed with DNA3.0. The dashed black line shows a
combination of a power-law and single exponential
fit (to account for photo-bleaching) to the data, and
the red line presents an exponential fit to the data for
comparison.

(C) Observed binding distribution of Atto565-DNA3.0
at time zero.

(D) Lifetime distribution of Atto565-DNA3.0 in the
absence of competitor dsDNA challenge.

(E and F) Lifetime distribution of Atto565-DNA3.0
when challenged with either (E) 1 riM DNA3.0 or (F)
1 nM salmon sperm DNA. N corresponds to the
number of Atto565-DNA molecules measured. Error
bars represent SEM.

DISCUSSION

The genetic transactions that take place during HR are governed
by the physicochemical properties of the macromolecules
that promote these reactions, and a full appreciation for the
elegance of DNA recombination requires a detailed understanding
of the underlying mechanistic principles. Our work suggests that
length-specific kinetic selection of 8-nt microhomology motifs
underlies the intrinsic ability of the Rad51/RecA recombinases
to efficiently align homologous sequences and mechanistically
distinguishes this process from the 3-nt steps that take place
during strand exchange. The use of microhomology motifs as
recognition elements has crucial implications for understanding
how DNA sequences are aligned during HR.
Figure 7. A Conserved Homology Search
Mechanism

(A) Model depicting a homology search mechanism
involving rapid sampling and rejection of DNA
lacking microhomology, followed by eventual
capture of an 8-nt tract of microhomology and
facilitated exchange allowing for an iterative
search through sequence space. Additional details
are presented in the main text.

(B and C) Plots showing power-law behavior during
dsDNA sampling (B) and microhomology-dependent
binding (C) for E. coli RecA, hRad51,
and S. cerevisiae Dmc1. Data presented for
ScRad51 are reproduced from Figures 3B and 4E
for comparison. The plus and minus 8-nt motif
designations in (C) correspond to Atto565-DNA2.1
and Atto565-DNA2.0, respectively, and N corresponds
to the number of PCs counted. Error bars
represent SD.

(D) Surface plot showing how search complexity
varies with PC length and the length of microhomology
necessary for dsDNA interrogation.

(E) Variation in search complexity for search
models employing different lengths of microhomology,
as indicated.

(F) Relationship between search complexity and
PC length for recognition involving 8-nt of microhomology.
The green shaded region encompasses
length estimates for S. cerevisiae PCs.

(G) Fraction of the S. cerevisiae genome that can
be kinetically ignored when employing a length-dependent
search mechanism based on recognition
of 8-nt motifs.

See also Figure S7. 



Microhomology Recognition Minimizes Search
Complexity

The advantages of a length-based microhomology recognition
can be illustrated by considering its influence on the amount of
sequence space that must be interrogated during the homology
search. The information that must be processed in order to
align two homologous sequences can be quantitatively described
as search complexity, which reflects the number of sites a
searching entity must visit within the genome while attempting
to locate a unique sequence (Figure 7A). A full treatment of
search complexity is presented as Supplemental Information;
here, we highlight key concepts and their relevance to HR.
In brief, search complexity can be defined as:

\[
\text{complexity}\ (\text{bp} \cdot \text{genome}^{-1}) = \frac{n}{4^{n}}\,(o - n + 1) \times \frac{(l - n + 1)}{l}
\]

where n is the length of microhomology
used during the search, l is the length of
the genome, and o is PC length. Any value
for search complexity >1.0 bp × genome⁻¹ indicates that the
PC will on average sample more than a genome equivalent's
worth of sites before locating homology; e.g., for an organism
with a 1 × 10® bp genome, a search complexity of 1 bp × genome⁻¹
indicates that the PC would on average need to sample
the equivalent of 100% of the genome (i.e., 1 × 10® bp) before
locating homology. Values <1.0 bp × genome⁻¹ reflect a search
that is accelerated relative to genome size; e.g., search
complexity of 0.1 bp × genome⁻¹ indicates that only one-tenth
of the genome would need to be sampled to locate homology.

The benefits of microhomology recognition can now be 
explored by considering the impact on search complexity (Fig- 
ures 7D-7G). The most important revelation from this analysis 
is that search complexity decreases exponentially with the min- 
imal length of microhomology necessary for dsDNA recognition. 
The source of this exponential dependence is evident given that,
for any genome, short sequences will always have many exact
matches, while longer sequences will always have fewer exact
matches. For example, any defined 3-nt motif occurs on average
once every 64-bp, and there would be ~377,229 such sequences
in the S. cerevisiae genome (Figure S7A). In contrast,
8-nt motifs will on average occur just once every 65,536-bp,
and there would only be ~762 identical 8-nt motifs in the yeast
genome (corresponding to an in vivo concentration of ~0.3 µM
for any given 8-mer). As a consequence, a search utilizing an
8-nt motif would only need to interrogate just ~0.01% of the
genome to locate the homologous target, and the vast majority 
of the genome could be kinetically ignored. Indeed, a homology 
search involving length-specific recognition of 8-nt motifs, while 
kinetically minimizing interactions with shorter sequence motifs, 
would effectively eliminate >99.9% of the genome for species 
ranging from E. coli to humans. 
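
The exponential dependence on motif length can be checked with a few lines of arithmetic. The sketch below assumes a random sequence with equal base frequencies and independent positions, which is a simplification and not a property of any real genome; the function names are hypothetical.

```python
def expected_spacing_bp(n: int) -> int:
    """Average spacing (bp) between occurrences of one defined n-nt motif
    in a random sequence with equal base frequencies."""
    return 4 ** n


def fraction_not_matching(n: int) -> float:
    """Fraction of start positions that do NOT match one defined n-nt motif."""
    return 1.0 - 1.0 / 4 ** n


if __name__ == "__main__":
    for n in (3, 8):
        print(
            f"{n}-nt motif: once every {expected_spacing_bp(n):,} bp; "
            f"{100 * fraction_not_matching(n):.3f}% of positions do not match"
        )
```

Under these assumptions each added nucleotide makes a given motif four times rarer, which is why an 8-nt motif leaves more than 99.9% of positions unmatched.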

Genetic and physical measures of the ssDNA overhangs
generated during DSB repair suggest that S. cerevisiae PCs
are 100-4,000 nt in length (Chung et al., 2010; Jinks-Robertson
et al., 1993), and it is informative to consider how search
complexity varies within this length regime. For a search utilizing
8-nt tracts of microhomology, a 100-nt PC would only need
to process information content corresponding to 1/100th of the
genome (Figure 7F, inset), a 4,000-nt PC would only need to
sample one-half of the genome (Figure 7F), and search
complexity would not enter the over-searched regime until PC
length exceeded ~8,000-nt (Figure 7F). In contrast, if one assumes
a model without microhomology recognition (i.e., n = 1),
then PCs ranging from 100-4,000-nt in length might have to process
information equivalent to 2,500%-100,000% of the
genome. These considerations illustrate how simply subdividing
the search into length-based microhomology recognition elements
can drastically reduce the time necessary to align homologous
sequences.
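
The figures quoted in this paragraph can be reproduced with a short calculation. The sketch below assumes that search complexity takes the form (n / 4^n) × (o − n + 1) × (l − n + 1) / l, with n the microhomology length, o the PC length, and l the genome length. That form is an assumption chosen because it reproduces the values quoted in the text; the genome size constant is an approximate, assumed value for S. cerevisiae, and the function name is hypothetical.

```python
def search_complexity(n: int, o: int, l: int) -> float:
    """Genome equivalents sampled before locating homology.

    Assumed form: (n / 4**n) * (o - n + 1) * (l - n + 1) / l
      n: microhomology length used during the search (nt)
      o: presynaptic complex (PC) length (nt)
      l: genome length (bp)
    """
    if not 1 <= n <= o <= l:
        raise ValueError("expected 1 <= n <= o <= l")
    return (n / 4 ** n) * (o - n + 1) * (l - n + 1) / l


GENOME_BP = 12_000_000  # assumed, approximate S. cerevisiae genome size

if __name__ == "__main__":
    for n in (8, 1):
        for o in (100, 4_000, 8_000):
            c = search_complexity(n, o, GENOME_BP)
            print(f"n={n}, PC={o:>5} nt: {c:10.4f} genome equivalents ({100 * c:,.1f}%)")
```

With n = 8 the result stays below one genome equivalent until the PC reaches roughly 8,000 nt, whereas n = 1 gives 25 and 1,000 genome equivalents for 100-nt and 4,000-nt PCs.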

Physiological Implications for HR and DSB Repair 

Our reductionist treatment of search complexity excludes poten- 
tial effects of accessory factors, chromatin structural proteins, 
chromosome organization, etc. Interpretation of our results 
within the context of these physiological realities leads to several 
important insights and predictions. First, end resection, PC as- 
sembly, and the homology search are often presented as distinct 
stages of DSB repair. However, there is no reason to believe that
these reactions are completely uncoupled, and the relative 
timing of these events dictates how much information must be 
processed during the homology search. Our results predict a 
substantial benefit to beginning the homology search as soon 
as possible after initiating DSB resection (Figure 7G). 



Second, for mechanisms involving length-dependent 
microhomology recognition, the fractional reduction in search 
complexity is the same regardless of genome size. Although 
longer recognition motifs offer the potential for further reductions 
in search complexity, this would compromise reversibility 
because of the greater enthalpic penalty incurred for disruption 
of a larger binding surface, which could ultimately lead to 
misalignment of DNA sequences trapped in local minima. More- 
over, assuming a randomized nucleotide distribution, the length 
required to statistically define a given sequence as unique does 
not vary drastically across species. For instance, average 
lengths of just ~12, ~13, and ~17 nucleotides are sufficient 
to uniquely define most sequences within the E. coli,
S. cerevisiae, and human genomes, respectively (Figure S7B).
These considerations imply that there may be little or no evolu- 
tionary pressure to utilize longer tracts of microhomology to 
compensate for variations in genome size. Notably, real ge- 
nomes contain repetitive sequences and other regions of low 
sequence complexity (e.g., rDNA and tRNA genes, transposons, 
centromeres, telomeres, etc.), and such regions would require 
longer sequences to define “uniqueness,” or else may suffer 
from a greater potential for misalignment during HR. Interest- 
ingly, recombination within these regions is often suppressed 
and/or otherwise tightly regulated (Eckert-Boulet and Lisby, 
2009, 2010; Pan et al., 2011; Sasaki et al., 2010), perhaps reflecting
ing in part the unique challenges faced by the recombination 
machinery in these regions of low sequence complexity. 
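
One simple way to see why the required length changes so little across species is to ask for the smallest n at which the number of possible n-mers (4^n) is at least the number of start sites in the genome. The sketch below uses that criterion, which is an assumption and not necessarily the calculation behind Figure S7B; the genome sizes are approximate, assumed values, both strands are counted, and the function name is hypothetical.

```python
def min_unique_length(genome_bp: int, strands: int = 2) -> int:
    """Smallest n for which 4**n is at least the number of n-mer start sites,
    assuming a randomized nucleotide distribution."""
    sites = genome_bp * strands
    n = 1
    while 4 ** n < sites:
        n += 1
    return n


# Approximate genome sizes in bp (assumed values, for illustration only).
GENOMES = {
    "E. coli": 4_600_000,
    "S. cerevisiae": 12_100_000,
    "human": 3_100_000_000,
}

if __name__ == "__main__":
    for name, size in GENOMES.items():
        print(f"{name}: {min_unique_length(size)} nt")
```

Because 4^n grows so quickly, a several-hundred-fold increase in genome size adds only a handful of nucleotides to the length needed for statistical uniqueness.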

Third, PC organization affects the amount of information that 
must be processed during the homology search. The preceding 
discussion assumes a contiguous PC consisting of all possible 
overlapping 8-nt units (Figure S7C). However, search complexity 
declines by an entire order of magnitude if the PC is segregated 
into non-overlapping 8-nt sections, and intermediate subdivi- 
sions are similarly beneficial (Figures S7C-S7E). It is not known 
whether PCs in vivo are comprised of uninterrupted Rad51/ 
RecA filaments, or whether they contain protein-free gaps and/ 
or other physical discontinuities (e.g., other HR proteins). Our re- 
sults suggest some proteins could promote HR by segregating 
Rad51/RecA filaments into non-overlapping functional units. 
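
The effect of PC organization can be illustrated by counting how many 8-nt units a PC presents under the two extremes described above. The sketch assumes, as a simplification, that search complexity scales with the number of units presented; the function names are hypothetical.

```python
def overlapping_units(pc_length: int, n: int = 8) -> int:
    """Number of n-nt units when every overlapping window is a search unit."""
    return max(pc_length - n + 1, 0)


def non_overlapping_units(pc_length: int, n: int = 8) -> int:
    """Number of n-nt units when the PC is split into adjacent, non-overlapping sections."""
    return pc_length // n


if __name__ == "__main__":
    for pc in (100, 1_000, 4_000):
        a, b = overlapping_units(pc), non_overlapping_units(pc)
        print(f"PC={pc:>5} nt: {a:>5} overlapping vs {b:>4} non-overlapping units ({a / b:.1f}-fold)")
```

For long PCs the ratio approaches n, so segregating an 8-nt search into non-overlapping sections reduces the number of units roughly 8-fold, close to an order of magnitude.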

Fourth, once the PC has engaged a particular 8-nt tract of 
microhomology it can undergo exchange with other regions of 
dsDNA bearing the same microhomology, but resists exchange 
with unrelated sequences. Moreover, shorter tracts of microho- 
mology are more readily exchanged with longer tracts, reflecting 
the higher stability of intermediates held together by longer tracts 
of Watson-Crick pairing. Preferential exchange with longer tracts 
of microhomology may yield a hierarchy of increasingly stable in- 
termediates, which might in turn funnel the PC through progres- 
sively smaller pools of sequences leading to the homologous 
target (Figure 7G). 

Fifth, compartmentalization of the search through either 
spatial organization or steric occlusion will decrease search 
complexity linearly with respect to the amount of sequence 
accessible for interrogation. Benefits are readily envisaged if 
homologous chromosomes are physically juxtaposed, as antic- 
ipated for sister chromatids immediately following DNA replica- 
tion, and accumulating evidence suggests that homologous 
sequences also have a greater probability of being juxtaposed
at other points in the cell cycle (Barzel and Kupiec, 2008; Gladyshev
and Kleckner, 2014; Weiner and Kleckner, 1994). Similarly,
restricting search intermediates to the linker DNA between nu- 
cleosomes could reduce search complexity by ~75% based 
on nucleosome occupancy of the S. cerevisiae genome. 

Reduction of Dimensionality versus Reduction of Search 
Complexity 

Target search studies have historically centered upon whether
the path-to-target involves 3D diffusion (i.e., “jumping”), or pathways
that accelerate the search through reduction of dimensionality
(i.e., facilitated diffusion), such as 1D diffusion (i.e., sliding or
“hopping”), or intersegmental transfer (von Hippel and Berg,
1989). Our results now highlight reduction of search complexity
as an efficient means of accelerating target searches. Rad51
accomplishes this by first looking for a small portion of its target
before testing the flanking DNA for homology. The difference in
stability for substrates bearing ≤7-nt versus ≥8-nt of microhomology
minimizes off-target interactions, ensuring that Rad51
spends most of the search interrogating sequences that already
have a high probability of being a homologous target (Figure 7G).
This mechanism is strikingly similar to the strategy employed by
the Cas9 CRISPR RNA-guided endonuclease (Sternberg et al.,
2014). Cas9 search intermediates are restricted to a trinucleotide
sequence called the protospacer adjacent motif (PAM). Cas9
kinetically ignores non-PAM sequences, but binds transiently to
PAMs (5'-NGG-3'), allowing it to test the flanking dsDNA for
complementarity to the guide RNA. This simple mechanism allows
Cas9 to kinetically ignore ~90% of the λ phage genome,
ensuring that the search is focused on sequences that have a
high probability of being the correct target (Sternberg et al.,
2014). Rad51 and Cas9 are unrelated, yet they share extraordinarily
similar search strategies; the only difference is that Cas9
looks for a fixed 3-nt motif, whereas Rad51 looks for variable
8-nt motifs. We suggest that similar mechanisms involving the
initial recognition of short sequence motifs representing just a
small portion of a complete binding site may be a broadly utilized
strategy for DNA-binding proteins to minimize search complexity
while searching within genomes for particular targets.
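
The two-stage logic described for Cas9 (recognize a short motif first, then test the flanking DNA) can be sketched as a toy string search. This is an illustration of the search strategy only, not a model of Cas9 biochemistry: it considers the top strand alone, requires an exact match to a DNA-equivalent guide sequence, and uses hypothetical function names and toy sequences.

```python
import re


def pam_sites(seq: str) -> list[int]:
    """0-based start positions of every 5'-NGG-3' motif on the given strand."""
    # A lookahead is used so that overlapping motifs are all reported.
    return [m.start() for m in re.finditer(r"(?=[ACGT]GG)", seq)]


def pam_gated_matches(seq: str, guide: str) -> list[int]:
    """Start positions where `guide` sits immediately 5' of a PAM.

    Only PAM-flanked positions are ever compared with the guide; every other
    position is skipped, mimicking the kinetic filter described in the text.
    """
    hits = []
    for pam_start in pam_sites(seq):
        start = pam_start - len(guide)
        if start >= 0 and seq[start:pam_start] == guide:
            hits.append(start)
    return hits


if __name__ == "__main__":
    toy = "CCGATTACATGGTTGATTACATAA"  # hypothetical sequence
    print(pam_sites(toy))
    print(pam_gated_matches(toy, "GATTACA"))
```

In the toy sequence the guide matches twice, but only the copy followed by a PAM is reported, because the second copy is never inspected.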

CONCLUSIONS 

Our work supports a model in which short tracts of microhomol- 
ogy represent the fundamental functional units of dsDNA 
recognition during HR, yielding insights into how Rad51/RecA
recombinases align homologous sequences. The emergent 
concepts may be broadly applicable. 

EXPERIMENTAL PROCEDURES 

S. cerevisiae RPA-eGFP and S. cerevisiae Rad51 were expressed and purified
as previously described (Gibb et al., 2014a). Single-stranded DNA substrates
were prepared by rolling circle replication using φ29 DNA polymerase and a 5'
biotinylated primer annealed to a circular M13mp18 ssDNA template (Gibb
et al., 2012). Fused silica slides were patterned by e-beam lithography and lipid
bilayers were prepared with 91.5% DOPC, 0.5% biotinylated-DPPE, and 8%
mPEG 550-DOPE (Avanti Polar Lipids) (Greene et al., 2010). Experiments were
performed using a prism-type TIRFM equipped with 488-nm and 561-nm lasers
(Coherent) and two iXon EMCCDs (Andor Technology). Videos were collected
with NIS Elements AR (Nikon), data were quantitated using NIH ImageJ, and
all survival probability curves were corrected for photo-bleaching.

All ScRad51 experiments were conducted at 30°C in HR buffer containing 
30 mM Tris-acetate (pH 7.5), 20 mM Mg-acetate, 50 mM KCl, 1 mM DTT,
0.2 mg/ml BSA, plus 2.5 mM ATP (Sugiyama et al., 1997). Presynaptic complexes
were assembled by incubating RPA-eGFP bound ssDNA curtains
with 2 µM ScRad51 in HR buffer for 15 min at 30°C. Free ScRad51 was then
flushed from the sample chamber using HR buffer plus 2.5 mM ATP. Presyn- 
aptic complex assembly was confirmed by visual inspection of the ssDNA 
before, during, and after the ScRad51 injection. 

DNA binding was measured by injecting Atto565-dsDNA (10 nM) into the 
sample chambers. Reactions were then incubated for 10 min in the absence 
of buffer flow, and free dsDNA was quickly flushed away. For reactions con- 
taining competitor dsDNA, the competitor was included at the indicated con- 
centration in the buffer used to flush the sample chamber. Data were obtained 
by acquiring single 100-ms frames at either 20-s, 30-s, 40-s, or 60-s intervals,
and the laser was shuttered between each acquired image to minimize 
photo-bleaching. Kymographs were generated from the resulting videos. 
The average number of bound dsDNA molecules, binding distributions and 
survival probabilities were all determined from analysis of the kymographs. 

Transient dsDNA sampling at higher temporal resolution was measured by
injecting Atto565-tagged dsDNA substrate (10 nM). Buffer flow was then terminated,
and data were acquired using a 60-ms exposure time and continuous
laser illumination in the absence of shuttering. The resulting data were analyzed
based on the corresponding kymographs, as previously described (Sternberg
et al., 2014).

Reaction conditions for E. coli RecA, S. cerevisiae Dmc1, and human Rad51
are presented in the Extended Experimental Procedures. Search complexity 
calculations presented in Figure 7 are described in the Extended Experimental 
Procedures. 

SUPPLEMENTAL INFORMATION 

Supplemental Information includes Extended Experimental Procedures and 
seven figures and can be found with this article online at
http://dx.doi.org/10.1016/j.cell.2015.01.029.

In Brief

Programmed frameshifting allows the
translation of alternative protein products
from a single transcript. To achieve this,
E. coli ribosomes undergo several
translocation excursions to shift reading
frames and access a range of codon
positions.

Highlights

• Ribosome translocation excursions occur before adopting a 
frame to resume translation 

• Ribosomes achieve -1, -4, and +2 nt slips to enter the -1
frame on the mRNA

• Ribosomes frameshift not from one specific codon but from
a range of codon positions

• The presence of incomplete translation products
underscores fidelity maintenance

SUMMARY

Programmed ribosomal frameshifting produces alter- 
native proteins from a single transcript. -1 frameshift- 
ing occurs on Escherichia coli’s dnaX mRNA contain- 
ing a slippery sequence AAAAAAG and peripheral 
mRNA structural barriers. Here, we reveal hidden as- 
pects of the frameshifting process, including its exact 
location on the mRNA and its timing within the trans- 
lation cycle. Mass spectrometry of translated prod- 
ucts shows that ribosomes enter the -1 frame from 
not one specific codon but various codons along the 
slippery sequence and slip by not just -1 but also 
-4 or +2 nucleotides. Single-ribosome translation tra- 
jectories detect distinctive codon-scale fluctuations 
in ribosome-mRNA displacement across the slippery 
sequence, representing multiple ribosomal trans- 
location attempts during frameshifting. Flanking 
mRNA structural barriers mechanically stimulate the 
ribosome to undergo back-and-forth translocation 
excursions, broadly exploring reading frames. Both 
experiments reveal aborted translation around 
mutant slippery sequences, indicating that subse- 
quent fidelity checks on newly adopted codon posi- 
tion base pairings lead to either resumed translation 
or early termination. 

INTRODUCTION 

During translation, the ribosome successively reads three nucle- 
otides— one codon— at a time to produce the protein encoded in 
the messenger RNA (mRNA). This process involves base pairing 
each codon with the anticodon of the cognate aminoacylated 
transfer RNA (aa-tRNA). As a result, each mRNA sequence spec- 
ifies one unique polypeptide translated from start to stop codon 
in the so-called 0 frame. Crystals of such ribosomes in the decoding
mode, complexed with mRNA and two tRNAs, have
provided a detailed structural basis for reading frame maintenance
(Jenner et al., 2010; Schmeing and Ramakrishnan,
2009; Selmer et al., 2006; Stahl et al., 2002; Yusupov et al.,
2001). Specifically, the mRNA wrapping around the neck of
the ribosome small subunit is kinked into segments, with the
three consecutive codons positioned in the exit- (E), peptidyl-
(P), and aminoacyl- (A) codon:anticodon binding sites. The anticodon
stem loop and the aminoacyl-acceptor end of each tRNA
are accommodated on the small (30S) and large (50S) subunits,
respectively, corresponding to the classical-state P/P- and A/A-
tRNAs (capital letters denote sites on 30S/50S). After peptidyl-
transfer between the tRNAs, the ribosome allows the two
tRNAs to adopt hybrid states in which the anticodons remain
in the 30S P and A sites, but the acceptor ends have advanced
to the E and P sites on the 50S, denoted as P/E- and A/P-tRNAs
(Bretscher, 1968; Moazed and Noller, 1989a). Upon binding
elongation factor EF-G in the A site, the ribosome can proceed
to translocate one codon forward (Rodnina and Wintermeyer,
2011; Savelsbergh et al., 2003). The strict reading frame configuration
and the segregation of decoding and translocation help
keep the translation accurate with an error rate of less than 
0.1% (Drummond and Wilke, 2009). 

However, ribosomes can be programmed to frameshift, accessing
either of the two out-of-frames (-1 or +1 frame), thereby
expanding gene coding capacity on a single transcript (Farabaugh,
1996). Such a mechanism is essential to the virulence
of compact genomic systems such as HIV-1, where successive 
frameshifts occur on the mRNA to produce a retroviral polypro- 
tein (Jacks et al., 1988). 

Here, we investigate frameshift-programming mRNAs derived
from the Escherichia coli dnaX gene. Its -1 frameshift efficiency
in vivo reaches 80%, yielding a 4:1 product ratio between the γ
subunit and the τ subunit of DNA polymerase III (Tsuchihashi
and Brown, 1992). Such translation regulation is achieved by
three sequence elements in the mRNA: a heptanucleotide slippery
sequence AAAAAAG that is flanked by an internal Shine-Dalgarno
sequence located 10 nucleotides (nt) upstream and
an 11 base pair (bp) hairpin 6 nt downstream (Figure 1A). It has
long been thought that, while the ribosome decodes the 0 frame
codons in the slippery sequence (A_AAA_AAG), the peripheral
base-pairing structures on the mRNA serve as barriers to impede
normal translation and promote backward frameshifting by 1 nt
(Figure 1A) (Gesteland and Atkins, 1996). Specifically, the upstream
Shine-Dalgarno sequence hybridizes with the complementary
anti-Shine-Dalgarno sequence at the 3' end of 16S ribosomal
RNA (rRNA), thus forming a flexible yet mRNA-anchoring
mini-helix (Jenner et al., 2007; Kaminishi et al., 2007; Korostelev
et al., 2007; Yusupova et al., 2006). Downstream, the base-pairing
junction of the hairpin acts as a roadblock situated at the
mRNA entry site on the ribosome, a single-strand-permitting
channel formed by three ribosomal proteins: S3 on the 30S
head and S4 and S5 from the 30S body (Yusupova et al., 2001).

Figure 1. Resolving Ribosomal Frameshifting
Codon Positions on dnaX-Derived mRNAs

(A) Three mRNA sequence elements program the
-1 nt ribosomal frameshift: a slippery sequence,
AAAAAAG (region in blue), an internal Shine-Dalgarno
sequence (region in brown), and a downstream
hairpin. The cartoon shows the position
of these elements on the mRNA relative to the
ribosome. Exact frameshift codon positions are
indistinguishable due to the identical product
sequence. A single mutation A4C in the slippery
sequence (I and II denote the two 0 frame codons)
differentiates possible frameshift positions.

(B) Various -1-stop-terminated products (sequences
ended in purple) from the A4C mutant,
detected by LC/MS, show that ribosomes frameshift
from different codons around the slippery
sequence, including positions I-1, II, and III. One
major frameshifted product, sequenced by LC/MS/MS,
bears an extra amino acid in the slippery
sequence region (Figure S1D and Table S2); thus,
the ribosome has slipped by -4 nt to enter the -1
frame. Two degenerate frameshift pathways exist
to translate such a product (right box): -4 slip at
codon position I or II (green-shaded rhombus
area); the latter imposes fewer codon:anticodon
base pair mismatches (red crosses).

Recently, perturbed ribosome translation dynamics on the
slippery sequence have been confirmed and visualized in
single-molecule fluorescence resonance energy transfer (FRET)
experiments (Chen et al., 2014; Kim et al., 2014). However,
exact details on how the programmed mRNA elements act
on the ribosome to induce frameshifting dynamics remain unclear
(Tinoco et al., 2013). For example, from which 0 frame
codon does the ribosome frameshift? In which sub-step within
the translation cycle does the frameshift take place? How
does the frameshift-programming mRNA break the regular ribosome
translation stepping (3 nt per codon) to promote efficient
and apparently precise frameshifting? Answering these
questions requires looking beyond ribosome conformational
dynamics; we thus sought to examine
the ribosome translation dynamics on
the mRNA and to characterize the synthesized
polypeptides.

Here, we use mass spectrometry (MS) to analyze the products 
of translation along dnaX-derived mRNAs containing wild-type 
or mutant slippery sequences. These analyses reveal that the 
ribosome can access a broad range of frameshift pathways by 
shifting from different codon positions and using various slipping 
sizes. We complement these studies by acquiring single-ribo- 
some translation trajectories using optical tweezers (Wen et al., 
2008) to follow in real time the ribosome dynamics that accompany
the exploration of alternative frameshift pathways. These
trajectories display distinctive fluctuations, larger than 1 nt, in
mRNA displacement during translocation as the ribosome attempts
to overcome the mRNA structural barriers flanking the
slippery sequence. We found that, after this dynamic exploration,
the ribosome may frameshift, but it is sensitive to mismatches
that result from the pairing between the frameshifted
codons and anticodons. These mismatches likely trigger a fidelity
check mechanism that causes the ribosome either to
continue translation in a new frame or to prematurely abort
translation.



RESULTS 

Frameshifts Occur at Various Codon Positions 

Ribosomes are thought to backshift on the mRNA slippery
sequence (AAAAAAG for the dnaX gene) and translate to the
-1 stop codon (Figure 1A) (Farabaugh, 1996). This conjecture
is based on the fact that, here, both the 0 frame (A_AAA_AAG)
and -1 frame (AAA_AAA_G) encode identical amino acids: a pair
of lysines. Hence, a -1 nt slippage in this region involves minimal
base-pairing difference between the lysine codons, AAA and
AAG, and the UUU anticodon used in E. coli (Tsuchihashi and
Brown, 1992). Note, however, that the resultant tandem lysines
incorporated in the frameshifted product preclude identifying
from the polypeptide sequence where exactly the 0 frame
ends. As shown in Figure 1A, on the dnaX slippery sequence,
there are three possible decoding routes, all translating the
same amino acids but via a -1 frameshift at different codon
positions. To differentiate those potential decoding routes, we
introduced a slippery sequence variant: A_AAC_AAG, which retains
~5% frameshift efficiency in vivo (Tsuchihashi and Brown,
1992). The single mutation of the fourth adenine to a cytosine lifts
the encoding degeneracy and yields different amino acid compositions
depending on the last-read 0 frame codon position
(Figure 1A, A4C mutant; the two 0 frame codons in the slippery
sequence are denoted as positions I and II). Like the original
dnaX mRNA, the mRNA variant contains an upstream internal
Shine-Dalgarno sequence and a downstream 25 bp duplex
equivalently 6 nt downstream from the slippery sequence (Figure
S1A); we denote this design as the “25 bp” mRNA construct.
To generate samples for frameshifting analysis, in-vitro-transcribed
and gel-purified mRNAs, together with E. coli 70S
ribosomes, were added to a reconstituted translation mixture
(PURExpress ΔRibosome kit, NEB) (Ohashi et al., 2010). The
in vitro translation products were then collected and examined
using liquid chromatography/mass spectrometry (LC/MS) intact
polypeptide detection (Extended Experimental Procedures).
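
The degeneracy of the wild-type slippery sequence, and the way the A4C mutation lifts it, can be checked by translating the heptamer in both frames. The sketch below is illustrative only: it uses the four standard-code codons that occur in these heptamers, decodes only codons lying entirely within the heptamer, and the function name is hypothetical.

```python
# Subset of the standard genetic code covering the codons in the two heptamers.
CODON_TABLE = {"AAA": "K", "AAG": "K", "AAC": "N", "CAA": "Q"}


def translate_frame(slippery: str, frame: int) -> str:
    """Translate the complete codons of a slippery heptamer.

    frame = 0  reads from the second nucleotide (A_AAA_AAG);
    frame = -1 reads from the first nucleotide  (AAA_AAA_G).
    """
    if frame not in (0, -1):
        raise ValueError("frame must be 0 or -1")
    start = 1 + frame
    codons = [slippery[i:i + 3] for i in range(start, len(slippery) - 2, 3)]
    return "".join(CODON_TABLE[c] for c in codons)


if __name__ == "__main__":
    for name, seq in (("wild-type", "AAAAAAG"), ("A4C", "AAACAAG")):
        print(name, "0 frame:", translate_frame(seq, 0), "| -1 frame:", translate_frame(seq, -1))
```

For the wild-type heptamer both frames give two lysines, so the products are indistinguishable, whereas the A4C variant gives different residues in each frame.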

In addition to the non-frameshifted, 0-stop-terminated polypeptide
(Figure 1B, bottom-most sequence in green), multiple
frameshifted products terminating at the -1 stop were identified
(Figure 1B; the last green-colored residue of each sequence
shows the last read 0 frame codon from which the ribosome
frameshifts). We found that ribosomes take two of the three
possible -1-frameshift-decoding routes (at codon position I-1
and II), respectively incorporating K₋₁Q₋₁ or N₀K₀ from the slippery
sequence (subscripts denote the frame); or they switch to
the -1 frame after the slippery sequence at codon position III
(Figure 1B, fourth sequence from the top). The frameshifted
polypeptide via slipping at codon position I was, however, not
observed (Figure S1B). Whereas a recent search for -1-frameshifted
products was limited to two codon positions on the slippery
sequence of HIV-1 (Liao et al., 2011), here we observed that
frameshifts in fact emerge from at least three positions.

We explored two more mRNA templates, bearing either the 
original frameshift-promoting slippery sequence, AAAAAAG 
(wild-type/25 bp; ~80% frameshift efficiency in vivo) or a frame- 
shift-attenuating variant, AAAAGAG (A5G/25 bp; ~0% frame- 
shift efficiency in vivo) (Tsuchihashi and Brown, 1992). In all 
templates examined, independently of the frameshift efficiencies 
attained, the ribosome undergoes -1 frameshifts from a broad 
range of codon positions spanning regions before, within, and 
beyond the slippery sequence (Table S1).

Ribosomes Frameshift via Various Slip Sizes 

Intriguingly, one polypeptide ~100 Da heavier than the other 
identified -1-frameshifted products was consistently detected 
in the mass spectrum (n = 5) as a major species translated 
from the A4C mutant mRNA (Figure 2A, top box, largest red 
bar: ~74% of all -1-stop-terminated products detected). To 
determine the sequence of this unexpected product, we employed 
tandem mass spectrometry (LC/MS/MS) to select and 
fragment the polypeptide (Figure 1B, right box; Table S2). We 
found that not two but three amino acids were incorporated 
along the slippery sequence for this unusual -1-stop-terminated 
frameshifted product. To translate an extra amino acid while 
switching to the -1 frame, the ribosome must slip by -4 nt 
during frameshifting. 

The resolved three amino acids, NKQ, can be translated via 
two possible decoding routes: a -4 slip either at codon position 
I or at position II on the slippery sequence (Figure 1B, right box). 
In the first route, the 0 frame ends at asparagine to yield 
N₀K₋₁Q₋₁; thus, the last two 0-frame-specified tRNAs, carrying 
alanine and asparagine, would encounter three mismatches (red 
crosses) after backshifting on the mRNA by 4 nt. In 
contrast, the second route would cause only one mismatch for 
the tRNA^Asn, suggesting that this route involving a -4 slip from 
codon position II could be the more productive frameshift 
pathway. 
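
The frame assignments above can be checked with a short sketch that translates each slippery heptamer in the 0 and -1 frames. It assumes the X XXY YYZ layout (0 frame codons at nt 2-4 and 5-7, -1 frame codons at nt 1-3 and 4-6) and infers the A4C heptamer AAACAAG from the variant name; the function name and the minimal codon table are illustrative only.

```python
# Minimal codon table: only the codons that occur in the three heptamers below.
CODON_TABLE = {"AAA": "K", "AAG": "K", "AAC": "N",
               "CAA": "Q", "GAG": "E", "AGA": "R"}

def decode_heptamer(heptamer):
    """Return (0-frame residues, -1-frame residues) for a 7-nt slippery sequence.

    Assumption: X XXY YYZ layout, i.e., the 0 frame reads nt 2-4 and 5-7,
    whereas the -1 frame reads nt 1-3 and 4-6.
    """
    if len(heptamer) != 7:
        raise ValueError("expected a 7-nt slippery sequence")
    zero_frame = [heptamer[1:4], heptamer[4:7]]
    minus_one_frame = [heptamer[0:3], heptamer[3:6]]
    return ("".join(CODON_TABLE[c] for c in zero_frame),
            "".join(CODON_TABLE[c] for c in minus_one_frame))

# A4C sequence is inferred from the variant name (position 4, A to C).
VARIANTS = {"wild-type": "AAAAAAG", "A4C": "AAACAAG", "A5G": "AAAAGAG"}

for name, seq in VARIANTS.items():
    zero, minus_one = decode_heptamer(seq)
    print(f"{name}: 0 frame = {zero}, -1 frame = {minus_one}")
```

For A4C this yields N and K in the 0 frame and K and Q in the -1 frame, the residues combined in the routes discussed above.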

For the other two templates (wild-type/25 bp and A5G/25 bp), 
several, though less abundant, frameshifted products bearing an 
extra amino acid were also detected (Figure S2A and Table S1; 
relative abundance ~5% and ~3% of overall -1-stop-terminated 
products). These species point to the general capability 
of the ribosome to conduct -4 nt slips on -1-frameshift-programming 
mRNAs. The presence of -4 slip products led us to 
expand the search for alternative slipping sizes entering the -1 
frame (our template design is capable of detecting only slips 
into the -1 frame and thus precludes readout of potential +1 
frameshifting; see Figure S1A). We found that ribosomes also 
take +2 slips and terminate at the -1 stop, producing frameshifted 
polypeptides one amino acid short (Figure 2A, top box, 
sequences ending in pink; relative abundance ~3% for A4C). 
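
The relationship between slip size, resulting frame, and polypeptide length can be sketched as simple modular arithmetic. The convention assumed here (negative values for backward, 5'-directed slips; length change counted relative to a plain -1 slip) and the function names are ours, chosen for illustration.

```python
def frame_after_slip(slip_nt):
    """Reading frame (-1, 0, or +1) reached after slipping by slip_nt nucleotides.

    Negative values denote backward (5'-directed) slips.
    """
    return (slip_nt + 1) % 3 - 1

def length_change(slip_nt):
    """Residues gained (+) or lost (-) relative to a plain -1 slip.

    Only defined for slips that enter the -1 frame.
    """
    if frame_after_slip(slip_nt) != -1:
        raise ValueError("slip does not enter the -1 frame")
    return -(slip_nt + 1) // 3

for slip in (-1, -4, +2):
    print(f"slip {slip:+d} nt: frame {frame_after_slip(slip):+d}, "
          f"length change {length_change(slip):+d} residue(s)")
```

Under this convention, -4 and +2 slips both land in the -1 frame, yielding products one residue longer and one residue shorter, respectively.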

The Figure 2A top box summarizes the relative abundance 
of all -1-stop-terminated products detected by LC/MS for the 
A4C mutant template (wild-type and A5G are in Table S1; see 
Extended Experimental Procedures for explanation of abundance 
measurements). These findings show that ribosomes 
slip by -1, -4, or +2 nt at various codon positions around the 
slippery sequence region, producing a collection of 
-1-stop-terminated products. While an earlier work reported that specific 
slipping sizes, e.g., -2, -1, +2, +5, and +6 nt, can be individually 
programmed by different mRNA templates (Weiss et al., 1987), 
our study shows that various slipping sizes take place on a single 
naturally occurring template. Note that the MS-resolved 
frameshifted polypeptides reported here would have appeared as a 
single protein band on electrophoresis gels and would therefore 
have been indistinguishable in earlier studies (Tsuchihashi and Brown, 
1992). 

The -1-stop- and 0-stop-terminated full-length products, 
however, only account for a fraction of the polypeptides 
found in the mass spectrum, e.g., ~39% of the total intensity 
of all species detected from the A4C template (Figure 2A). We 
found the rest to be incomplete polypeptides ended at 0 frame 
codon positions around the slippery sequence and particularly 
Figure 2. In Addition to Various Stop-Codon-Terminated Polypeptides, 
Frameshift-Programming mRNAs Produce Incomplete Species 

(A) A4C slippery sequence variant (construct with a 
downstream 25 bp duplex) as an example: LC/MS 
detected a broad collection of -1-stop-terminated 
products frameshifted from codon positions around 
the slippery sequence; the top bar graph shows 
their relative abundance (x axis). These frameshifted 
species were translated via -1 slips (blue), -4 slips 
(red), and +2 slips (pink); the latter two lead to 
polypeptides one amino acid longer or shorter 
(Figure S2). When degenerate decoding routes exist 
(as those shown in Figure 1B, right box; numbers of 
base-pair mismatches for the last two 0 frame 
tRNAs are tabulated here in parentheses; every 
non-Watson-Crick base pair scores a 1), we assigned 
the given product to the frameshift codon position with 
fewer mismatches. Incomplete polypeptides ending 
with 0 frame amino acids along the slippery 
sequence were also found (sequences in orange; 
orange peaks in the mass spectrum; Table S1). 

(B) Bottom bar graph: relative abundance (y axis) of 
detected species in the MS spectrum; species are 
organized based on their last 0 frame amino acid 
incorporated, i.e., 0 frame polypeptide length (x axis). 
Error bars represent SD. A 2D diagram, focusing on 
codon positions around the slippery sequence (SS) 
region, displays from where (x axis) the ribosome 
frameshifts or leaves behind incomplete species. 
With the y axis listing the mRNA nucleotide counts in 
reference to the 0 frame, incomplete species (orange 
dots) lie along the diagonal line; the frameshifted 
products, as located by the first nucleotide 
read in the -1 frame on the mRNA, distribute 
above and below the diagonal line. 



accumulating at positions where the ribosome tends to frame- 
shift (Figure 2A, left orange box). We hence sorted full-length 
and incomplete species by their last-incorporated 0 frame amino 
acids and rearranged the purely mass-based LC/MS spectrum 
into a bar graph ordered by 0 frame polypeptide length (i.e., x 
axis of the bottom graph in Figure 2B); the bar heights depict 
the relative abundance of products detected (SDs from multiple 
measurements shown as error bars). 

To provide a comprehensive view of the various ribosomal 
frameshifting translation events observed around the slippery 
sequence region, we constructed a 2D diagram (Figure 2B, top) 
to visualize from which 0 frame codon and to which -1 frame 
codon the ribosome slips. Specifically, the x axis marks the 
last-read 0 frame codon position along the mRNA, whereas 
the y axis, counting mRNA nucleotides in the 0 frame by 
multiples of 3, indicates the first nucleotide read in the -1 frame. 
Therefore, incomplete species (orange dots) drop off along the 
diagonal, whereas -1-stop-terminated products frameshifted 
via plus slips (pink dots) and minus slips (blue and red dots) 
distribute above and below. 

Ribosomes Make Distinctive Translocation Attempts 

Having detected such diverse frameshift pathways via translation 
product analysis, we sought to unravel the molecular mechanisms 
that give rise to the broad range of ribosomal slippage 
observed. Specifically, how does the ribosome switch the 
mRNA reading frame, i.e., allow the tRNAs to slip and to 
simultaneously base pair across adjacent codons that are 
spatially kinked into nucleotide triplets by the intercalating 16S 
rRNA residues inside the 30S decoding groove (Yusupova 
et al., 2001)? This issue can be addressed by determining 
when, and how, within one translation cycle the ribosome 
frameshifts. To this end, we employed a real-time in vitro 
mRNA hairpin unwinding assay. Using optical tweezers we 
monitored codon-by-codon translation by a single ribosome 
along the entire frameshift-programming mRNA template 
embedded inside a 92-bp-long hairpin (Figures 3 and S3A) (Qu 
et al., 2011; Wen et al., 2008). 

Here, an mRNA hairpin molecule bound with a single ribosome 
is tethered through its two ends and held under tension 
by the optical tweezers (Figure 3). As the ribosome gradually 
translates the mRNA, it must unzip the hairpin by 3 bp per codon. 
As a result, the tether end-to-end distance extends by 6 nt (i.e., 
the gridline spacing: ~2.65 nm/codon on the left y axis in Figures 
4A and 5A trajectory plots), thus reporting the translocation 
movement of the ribosome from one codon to the next. As 
seen from the trajectory, translation occurs in alternating phases 
of translocations and dwells (seen as vertical extensions and 
horizontal segments). During each dwell, the ribosome decodes 
the A site codon and catalyzes peptidyl-transfer between the P- 
and A-tRNAs (Rodnina et al., 2005; Wohlgemuth et al., 2008). 
The ribosome subsequently binds the GTPase EF-G, partially 
displacing the P- and A-tRNAs into the 50S E and P sites, and 
proceeds with mRNA forward translocation, which requires un- 
zipping the transcript downstream (Qu et al., 2011; Rodnina 
and Wintermeyer, 2011; Savelsbergh et al., 2003; Wen et al., 
2008). To confirm whether the ribosome terminated at the 0 or 
-1 frame stop, after translation ceased, we applied force to 
unfold the remaining mRNA hairpin. Because the two stop co- 
dons result in different untranslated residual hairpin sizes— as 
measured by the mRNA extension gained from unfolding— we 
could verify where the ribosome ended in each trajectory (shown 
schematically in Figure S3B). 
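
The conversion between measured extension and ribosome displacement follows directly from the numbers quoted above (6 nt released and ~2.65 nm gained per codon at the applied tension). The sketch below is illustrative only; the constant and function names are ours, and the linear conversion assumes a constant tension throughout.

```python
NM_PER_CODON = 2.65  # extension gained per codon translocated (from the text)
NT_PER_CODON = 6     # unzipping 3 bp releases 6 nt of single-stranded mRNA

def extension_to_codons(delta_nm):
    """Convert a change in tether extension (nm) into codons translocated."""
    return delta_nm / NM_PER_CODON

def nm_per_nucleotide():
    """Average extension contributed by each released nucleotide."""
    return NM_PER_CODON / NT_PER_CODON

print(f"{nm_per_nucleotide():.3f} nm per released nucleotide")
print(f"a 0.8 nm change corresponds to {extension_to_codons(0.8):.2f} codon")
```

With these values, a 0.8 nm uncertainty in extension corresponds to roughly one-third of a codon.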

An ~55 bp hairpin remains ahead as the ribosome resides on 
the slippery sequence (Figure S3A) downstream from the internal 
Shine-Dalgarno sequence (with the first codon, i.e., position I, 
of the slippery sequence in the P site) (Qu et al., 2011). We 
chose a “55 bp construct” for the tweezer experiments instead 
of the previously discussed 25 bp construct because the 
longer— and thus more stable— hairpin allows more accurate 
measurements of the termination codon positions. Both frame- 
shift-promoting (wild-type/55 bp) and frameshift-attenuating 
(A5G/55 bp) slippery sequence variants were translated on the 
tweezers. The frameshift efficiencies for these two templates 
were 77% and 57%, respectively, showing a trend similar to 
that observed in vivo (Table 1) (Tsuchihashi and Brown, 1992). 

Significantly, ~90% of the trajectories exhibit distinct fluctuations 
in mRNA extension specifically around the slippery sequence 
region (orange-shaded area in Figures 4A and 5A; trajectories 
recorded at 1 kHz and displayed at 20 Hz). These unique 
signals manifest back-and-forth movements of the ribosome 
on the mRNA of >1 codon on average, distinctively above the 
noise level, and are not observed elsewhere in the trajectory 
Figure 3. Probing Ribosomal Frameshifting 
Translation Translocation Dynamics Using 
Optical Tweezers 

A single-ribosome translation progression is reported 
by the stepwise unwinding of a 92 bp 
mRNA hairpin held on the optical tweezers (see 
also Experimental Procedures); 3 bp are unzipped 
per codon translocated at the hairpin junction, 
thus reflecting displacements between the ribosome 
and mRNA. When the first 0 frame codon in 
the slippery sequence (codon position I) resides in 
the ribosome 30S P site, a 55 bp hairpin remains 
downstream. Hairpin portions not unwound by the 
ribosome were measured at the end of experiments; 
if the ribosome terminates at the -1 stop, it 
leaves a smaller residual hairpin, as compared to 
that for the 0 stop termination (Figure S3B). Both 
the wild-type slippery sequence and the 
frameshift-attenuating A5G mutant were examined. 
(Figures 4A and 5A zoom-ins; a range 
of fluctuation amplitudes appears within 
each trace; Figure 4B table; more examples 
in Figure S4B). These fluctuations are noticeably different 
from the noisy sections recorded when the ribosome nears the 
end of the hairpin, where the tethered mRNA has mostly un- 
wound into single strands and inevitably become much more 
elastic (the increase in noise is illustrated in Figure S4A and is 
corroborated by frequency analysis in Figure S4D). These large 
displacement fluctuations between the ribosome and the 
mRNA around the slippery sequence indicate that multiple 
mRNA translocation attempts occur at this region and that large 
slipping sizes such as -4 nt are indeed attainable. Interestingly, 
we find that fluctuations appear regardless of whether the ribo- 
some ultimately frameshifts or not (Figure 4B table); also, they 
consistently occur even when the sequence is not slippery, 
e.g., on the A5G mutant mRNA (Figure 5A; Figure 4B table for 
lifetimes) whose frameshifting efficiency is reduced. Just as the 
slippery sequence is not the cause of fluctuations, neither is 
the hairpin; even though a hairpin barrier always remains in front 
of the ribosome throughout the entire trajectory, we detect these 
fluctuations only at the region downstream from the internal 
Shine-Dalgarno sequence. Thus, these observations indicate 
that a combination of flanking structural barriers (the upstream 
Shine-Dalgarno:anti-Shine-Dalgarno mini-helix and the 
downstream hairpin junction) suffices to induce distinctive fluctuating 
ribosome translocation dynamics as the ribosome translates the 
region between the barriers. The barrier-induced, multiple 
translocation excursions directly observed here have been 
indirectly detected in single-molecule fluorescence experiments 
(Chen et al., 2014; Kim et al., 2014). 

To probe the nature of these large and persistent fluctuations, 
we characterized their dynamics. The average excursion 
lifetime, i.e., the time between a backward shift and a forward 
motion (Figures 4A and 5A, zoom-ins), is ~0.5 s, independent of 
slippery sequence variant and frameshifting outcome (Figure 4B 
table). The distribution of the pooled excursion lifetimes is not a 
single exponential, indicating that more than one rate-limiting 
[Figure 4A panel: single-ribosome translation trajectory along the wild-type/55-bp hp mRNA construct] 

[Figure 4B table; values are mean ± SD] 

                              wild-type            A5G 
Excursion lifetime 
  -1-stop-terminated trace    0.48 ± 0.39 s        0.46 ± 0.34 s 
  0-stop-terminated trace     0.43 ± 0.25 s        1.20 ± 0.85 s 
Fluctuation amplitude 
  -1-stop-terminated trace    1.32 ± 0.62 codon    1.15 ± 0.38 codon 
  0-stop-terminated trace     1.12 ± 0.59 codon    0.75 ± 0.35 codon 


Figure 4. Characteristic Fluctuations during 
Ribosome Translocation across the Slippery 
Sequence 

(A) A single-ribosome translation trajectory along 
the frameshift-promoting wild-type slippery sequence; 
recorded at 1 kHz and displayed at 20 Hz 
here. Upon each translocation step taken by the 
ribosome (vertical advances along y axis, indicated 
by black arrowheads), the hairpin releases 6 nt per 
codon; this is seen as a 2.65 nm increment (spacing 
between gridlines of the same color) in mRNA 
end-to-end extension under a tension of 18 pN 
(Extended Experimental Procedures). Given the 
mRNA template, amino acids incorporated to the P 
site tRNA after each translocation step are labeled 
next to the gridlines (in letter codes; green for 
0 frame, purple for -1 frame). While the ribosome 
continually translocates against a hairpin, characteristic 
fluctuations in mRNA extension (zoom-in 
below) occur downstream from the internal 
Shine-Dalgarno sequence around the slippery sequence 
region (orange-shaded area; Figure S4B). 

(B) The characteristic fluctuations were seen for 
both wild-type and A5G slippery sequence variants 
and in both frameshifted and non-frameshifted 
trajectories, with an amplitude >1 codon (magenta 
double-headed arrow on the zoom-in trace in panel 
A) and an average excursion lifetime of ~0.5 s for 
one round of back-and-forth fluctuation (horizontal 
line segments in magenta; see also Figure S4C; n > 
10 trajectories analyzed for each of the four categories; 
data are represented as mean ± SD). 



stochastic event controls their duration. Accordingly, from the 
mean lifetime and its SD, we calculated a lower bound for the 
apparent number of rate-limiting steps, n_min, to be ~1.6 
(Figure S4C) (Moffitt et al., 2010). This value indicates that at least 
two rate-limiting dynamic events, of similar time scales, are 
required to return from a translocation excursion. 
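
As a minimal sketch of this estimate, the lower bound can be computed as the squared ratio of the mean dwell time to its SD, n_min = mean^2 / SD^2. The function name is ours, and the inputs below are the per-category values from the Figure 4B table rather than the pooled lifetimes, so the outputs are not expected to reproduce the ~1.6 quoted above exactly.

```python
def n_min(mean_lifetime_s, sd_lifetime_s):
    """Lower bound on the number of rate-limiting steps: mean^2 / variance."""
    if sd_lifetime_s <= 0:
        raise ValueError("SD must be positive")
    return (mean_lifetime_s / sd_lifetime_s) ** 2

# Per-category excursion lifetimes (mean, SD) in seconds, from the Figure 4B table.
LIFETIMES = {
    "wild-type, -1 stop": (0.48, 0.39),
    "wild-type, 0 stop": (0.43, 0.25),
    "A5G, -1 stop": (0.46, 0.34),
    "A5G, 0 stop": (1.20, 0.85),
}

for label, (mean, sd) in LIFETIMES.items():
    print(f"{label}: n_min = {n_min(mean, sd):.2f}")
```

A single exponential process gives n_min = 1, so any value above 1 argues for more than one rate-limiting step.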

In addition to the above ~2 Hz dynamics resolved from the 
real-time trajectory data displayed at 20 Hz, we identified faster 
dynamics using power spectrum density analysis (from the 1 kHz 
raw data; Figure S4D). Enhanced fluctuations of 30, 85, and 
180 Hz take place exclusively in the slippery sequence region, 
as compared to elsewhere in the trajectory. These timescales 
are similar to those reported for the 30S body and head dy- 
namics during regular translation — in particular, the head for- 
ward rotation at 80 Hz and reverse rotation at ~4-5 Hz (Guo 
and Noller, 2012). It is thus possible that the fluctuations 
captured at the slippery sequence region in the tweezers data 
reflect the conformational excursions of the ribosome 30S 
head during multiple mRNA forward translocation attempts. 
These fluctuations— which are not present during regular unidi- 
rectional translation on a hairpin— are uniquely promoted by me- 
chanical mRNA barriers flanking the slippery sequence in order 
to achieve frameshifting. 
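
To illustrate how a power spectrum singles out fast periodic components in a 1 kHz recording, the sketch below applies a plain discrete Fourier transform to a synthetic trace. The trace is fabricated for illustration (two sine components at 85 and 30 Hz) and is not experimental data; the function names are ours, and the naive transform stands in for whatever spectral estimator was actually used.

```python
import cmath
import math

def power_spectrum(samples, sample_rate_hz):
    """Return (frequency in Hz, power) pairs from a naive DFT of the trace."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # remove the DC offset
    spectrum = []
    for k in range(1, n // 2):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(centered))
        spectrum.append((k * sample_rate_hz / n, abs(coeff) ** 2 / n))
    return spectrum

def dominant_frequency(samples, sample_rate_hz):
    """Frequency (Hz) of the largest peak in the power spectrum."""
    return max(power_spectrum(samples, sample_rate_hz), key=lambda fp: fp[1])[0]

# Synthetic 1 s trace sampled at 1 kHz (illustration only, not measured data).
RATE_HZ = 1000
trace = [math.sin(2 * math.pi * 85 * i / RATE_HZ)
         + 0.5 * math.sin(2 * math.pi * 30 * i / RATE_HZ)
         for i in range(RATE_HZ)]

print(dominant_frequency(trace, RATE_HZ))
```

Comparing such spectra between the slippery sequence region and the rest of a trajectory is one way to expose enhanced fluctuations that are invisible after averaging to 20 Hz.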

Translocation Fluctuations Allow Reading Frame 
Sampling 

The large-scale, multiple translocation excursions observed 
across the slippery sequence region provide direct real-time 



insight into the mechanical movements that may be required 
for ribosomes to access the broad range of frameshift pathways 
independently resolved by MS with the 25 bp mRNA constructs. 
To relate these two findings, we analyzed the LC/MS-detected 
polypeptides translated with the same 55 bp mRNA hairpin 
construct used in the tweezer experiments; we show the frame- 
shift-attenuating A5G template as an example (Figure 5B, top 
bar graph). Similarly to Figure 2B, we compiled the frameshift 
pathways identified for A5G/55 bp into a 2D diagram (Figure 5A, 
bottom center; product assignments shown in Figure S5B) and 
aligned the resultant 2D plot with the translocation dynamics 
observed in the single-ribosome translation trajectory (Figure 5A, 
lower-left zoom-in). The two results show clear correspondence 
along the slippery sequence region (orange shaded area). 

Because the ribosome constantly moved back and forth over 
> 1 codon around the slippery sequence (Figures 4A and 5A, 
black-squared sections), it is not possible to pinpoint when along 
the fluctuating trajectory the new frame is established on the 
mRNA sequence. Nonetheless, the zoom-in of the fluctuating 
dynamics does reveal the locations on the mRNA transiently 
visited by the translocating ribosome. The fluctuation magni- 
tude, i.e., the slipping range spanned by the translocating ribo- 
some, coincides with the range of protein products distributed 
in the 2D diagram, supporting the inference that the observed 
multiple translocation excursions reflect the ribosome sampling 
of different reading frames. The correlation between the two 
independently acquired data sets is further strengthened when 
the abundance of frameshift translation products— including 
Table 1. Apparent Frameshift Efficiency and Overall Frameshift Slipping Attempts 

                                         Wild-Type            A4C                  A5G 

In Vivo Translation, Protein Gel Resolved 
  Construct                              11 bp hp             11 bp hp             11 bp hp 
  Frameshift efficiency                  80%                  5%                   0% 
    (-1 stop)/(-1 stop + 0 stop) 

In Vitro Translation, LC/MS-Detected (n > 3) 
  Construct                              25 bp    55 bp hp    25 bp    55 bp hp    25 bp    55 bp hp 
  Frameshift efficiency                  99%      99%         33%      66%         9%       35% 
    (-1 stop)/(-1 stop + 0 stop) 
  Overall slipping attempts              99%      99%         65%      87%         39%      66% 
    (-1 stop + drop-off)/(-1 stop + drop-off + 0 stop) 
  Most probable drop-off codon position  I, II (21%, 29%)     I (34%)              II (58%) 

In Vitro Translation, Single-Ribosome Translation Trajectories 
  Construct                              55 bp hp             -                    55 bp hp 
  Number of trajectories                 n = 134              -                    n = 216 
  Frameshift efficiency                  77%                  -                    57% 
    (-1 stop)/(-1 stop + 0 stop) 
  Overall slipping attempts              83%                  -                    69% 
    (-1 stop + aborted)/(-1 stop + aborted + 0 stop) 
  Most probable aborted codon position   I, II (30%, 40%)     -                    II (58%) 

In addition to the conventionally defined frameshift efficiency, which accounts only for the -1-stop- and 0-stop-terminated products (gray rows 1 [Tsuchihashi 
and Brown, 1992], 2, and 5), the overall slipping attempts made by the ribosome are estimated by including the incomplete, i.e., drop-off, species or, 
equivalently, the prematurely stalled and aborted translation trajectories (gray rows 3 and 6). For the two template variants examined (wild-type and 
A5G), the most probable aborted codon positions are consistent with the most probable drop-off codon positions resolved by LC/MS. Some differences 
in frameshift efficiency are seen between in vivo and in vitro translation conditions, which we attribute to known differences in overall translation rates. 
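
The two ratios defined in Table 1 can be written out as a short sketch. The inputs are the product fractions given in the Figure 5B legend for the 55 bp constructs; the function and variable names are ours. The frameshift efficiencies computed this way fall close to the 35% and 66% LC/MS entries, whereas the overall-attempt values come out somewhat higher than the tabulated 66% and 87%, presumably because the table counts a narrower set of drop-off species; the second function is therefore only an illustration of the ratio.

```python
def frameshift_efficiency(minus1_stop, zero_stop):
    """(-1 stop) / (-1 stop + 0 stop)."""
    return minus1_stop / (minus1_stop + zero_stop)

def overall_slipping_attempts(minus1_stop, dropoff, zero_stop):
    """(-1 stop + drop-off) / (-1 stop + drop-off + 0 stop)."""
    return (minus1_stop + dropoff) / (minus1_stop + dropoff + zero_stop)

# Product fractions (%) as given in the Figure 5B legend, 55 bp constructs.
FIG5B = {
    "A5G": {"minus1_stop": 16, "dropoff": 55, "zero_stop": 29},
    "A4C": {"minus1_stop": 19, "dropoff": 71, "zero_stop": 10},
}

for variant, f in FIG5B.items():
    eff = frameshift_efficiency(f["minus1_stop"], f["zero_stop"])
    attempts = overall_slipping_attempts(f["minus1_stop"], f["dropoff"], f["zero_stop"])
    print(f"{variant}: efficiency = {eff:.1%}, overall attempts = {attempts:.1%}")
```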



incomplete species— is taken into account. Specifically, the 
locations around the slippery sequence that were frequently 
visited by the fluctuating ribosome, thus transiently establishing 
alternative codon:anticodon base pairing, are also the places 
where the more highly populated frameshift translation products are 
found (Figure 5, red arrow). 

Altogether, our findings portray a dynamic frameshifting 
scheme via alternative reading frame sampling, which is accessed 
upon multiple mRNA translocation attempts by the ribosome. 
Because these distinctive translocation excursions are 
seen only when translating between mRNA structural barriers, 
the energetic cost of breaking codon:anticodon base pairs to 
frameshift is likely partially balanced by the energy liberated at 
the peripheral base-pairing interactions. It is conceivable that 
multiple EF-G binding events (Chen et al., 2014) may also play 
a role in driving the excursions. Furthermore, such a broad 
browsing range is presumably permitted by the swiveling and 
rotating 30S head when the ribosome is in the translocating 
mode, as suggested by the fluctuation frequency analysis 
described above. Indeed, structural studies have shown that, 
first, head tilting rearranges the mRNA binding groove on the 
30S neck by disengaging the 16S rRNA residues that intercalate 
on the mRNA (Pulk and Cate, 2013; Schuwirth et al., 2005; 
Tourigny et al., 2013; Zhou et al., 2013); this process should ease the 
spatial restriction that prevents the tRNAs from base pairing 
across adjacent codons in the out-of-frame manner. Second, 
30S head rotation is coupled with mRNA forward translocation 
(Dunkle et al., 2011; Ermolenko and Noller, 2011; Gao et al., 
2009; Guo and Noller, 2012; Ramrath et al., 2012, 2013; Ratje 
et al., 2010; Zhang et al., 2009; Zhou et al., 2012); thus, the 
30S head is likely the agent for achieving large mRNA displacements 
that facilitate ribosomal frameshifting. A similar mechanism 
was proposed in recent kinetic studies, where, averaging 
over an ensemble and presumably over multiple translocation 
attempts, prolonged 30S head rotation was observed on a 
frameshift-programming mRNA (Caliskan et al., 2014). Our results 
hence illustrate a mechanism by which the 30S head rotation is 
perturbed by the flanking mechanical barrier elements, leading 
to multiple translocation attempts that enable frameshifting. 

Not Every Frameshift Attempt Succeeds 

Although the distinctive translocation excursions observed in the 
single-ribosome translation trajectories occur at the slippery 
sequence region, they are independent of the detailed content 
encoded in that sequence. Hence, we wondered how the overall 
distribution of frameshift translation products varies as the ribo- 
some translates different templates. 

For all template variants examined (the “55 bp” mRNA con- 
struct series; Figure 5B, bar graphs), as noted before, incomplete 
species (ended with 0 frame amino acids before reaching the 
0 stop; orange bars) accumulate at codon positions along the 
slippery sequence where the ribosome frequently frameshifts. 
When the slippery sequence becomes “less slippery,” i.e., 
more likely to cause codon:anticodon base-pair mismatches 
Figure 5. Connecting Frameshift Translation 
Dynamics and Product Distribution 

(A) The >1 codon translocation fluctuations (black-squared 
section on the blue trace; expanded 
underneath) persist in the translation trajectory along 
the frameshift-attenuating A5G mutant, occurring 
around the slippery sequence (orange-shaded 
area). Meanwhile, LC/MS detected a wide range of 
frameshift translation species produced from the 
same A5G/55-bp construct, including frameshifted 
and incomplete species (purple and orange dots in 
2D diagram; x axis showing last-read 0 frame codons, 
I-3 through IV; y axis marking first-read nucleotides in the -1 
frame, relative to those counted in the 0 frame; 
Figure S5B). The accumulation of frameshifted and 
incomplete species at codon position II (the column 
of purple and orange dots indicated by red arrow) 
coincides with the locations on the mRNA slippery 
sequence region that were frequently explored by 
the back-and-forth fluctuating ribosome, as revealed 
by the trajectory zoom-in section. 

(B) Relative abundances of LC/MS-detected 
translation products from the 55 bp mRNA constructs, 
one each for the A5G, A4C, and wild-type 
slippery sequence variants representing low, 
medium, and high frameshift efficiency, are 
shown in bar graphs (error bars represent SD). 
Products are sorted by their last 0 frame amino 
acids incorporated along the mRNA (x axis: 
increasing 0 frame polypeptide length), and their 
abundances (Table S1) are plotted along the y 
axis. Less efficient slippery sequence variants 
produce higher amounts of incomplete species, 
particularly at codon positions I and II, from where 
most frameshifted species were also translated 
(purple bars). Numbers of base-pair mismatches 
for the frameshifted E- and P-tRNAs are tabulated 
in parentheses for frameshift pathways at 
codon positions I and II, via -1 or -4 slips (in 
subscripts). We count 1 for every non-Watson-Crick 
interaction. 

[Figure 5B panel legends] 
A5G/55-bp hp: incomplete (55%), frameshifted (16%), non-frameshifted (29%); mismatches I-1: (3, 0), II-1: (0, 2) 
A4C/55-bp hp: incomplete (71%), frameshifted (19%), non-frameshifted (10%); mismatches I-1: (3, 1), I-4: (2, 1), II-1: (1, 1), II-4: (1, 0) 
wild-type/55-bp hp: incomplete (25%), frameshifted (75%), non-frameshifted (<1%); mismatches I-1: (3, 0), II-1: (0, 0) 

upon a slip, the incomplete species also become more populated 
(Figure 5B; 55%, 71%, and 25% for A5G, A4C, and wild-type, 
respectively). The equivalents of these aborted translation 
products were also observed among the single-ribosome translation 
trajectories: some of the ribosomes aborted translation 
midway and prematurely stalled around the slippery sequence. 
Based on the residual hairpin sizes measured upon ribosome 
stalling (Figure S3B; position resolution: ±0.8 nm ≈ 1/3 codon), 
we found that the most probable aborted codon positions for 
each slippery sequence variant agree with the most abundant 
drop-off species detected by MS (Table 1). The presence of 
prematurely stalled ribosomes and aborted polypeptides indicates 
that not every frameshift attempt succeeds in resuming translation. 
Accordingly, our data suggest that, while translocation 
excursions induced around the slippery 
sequence cause the ribosome to slip out 
of the 0 frame, if no compatible codon:anticodon 
base pairing is found, the ribosomes 
stall and fail to incorporate the 
next amino acid, leading to the generation of incomplete 0 frame 
polypeptides. 

To learn why some frameshift attempts fail, we compared 
the relative abundances of incomplete species to their 
frameshifted, -1-stop-terminated counterparts. For the A4C mutant 
(Figure 5B, middle bar graph), ~87% of the -1-stop-terminated 
products (purple bars) come from codon position II, 
via -1 and -4 slips. However, a comparable amount of 
incomplete polypeptides (orange bars), and no frameshifted 
products, was detected at codon position I. Such biases 
over particular frameshift pathways can be explained by the 
number of codon:anticodon base-pair mismatches encountered 
by the two tRNAs on a translocating ribosome. Here, 
we use a dyad notation, (x, y), to annotate the number of 
Figure 6. During mRNA Translocation: A Dynamic Scheme for Versatile Ribosomal Frameshifting 

Left: after the polypeptide chain (magenta curvy line) transfers from the P- to A-tRNA (blue and red vertical sticks), the elongation factor G (EF-G:GTP; complex in 
cyan and yellow star) catalyzes the P/E-, A/P-tRNAs translocation on the ribosome, along with mRNA (gray dashed line) forward translocation by one codon. This 
mRNA movement is brought about by 30S head forward rotation (dark orange counter-clockwise arrow), displacing the E, P, and A site codons (in green, blue, and red) 
to the left (gray downward arrow). To reset the ribosome for the next round of translation, the head rotates back (middle cartoon; dark orange clockwise arrow). 
Multiple 30S head rotations, and thus back-and-forth mRNA displacements, may be needed to achieve translocation between flanking mRNA structural barriers, e.g., 
Shine-Dalgarno:anti-Shine-Dalgarno mini-helix and downstream hairpin, hence permitting the tRNAs to base pair in alternative frames around the slippery 
sequence. When a new frame is adopted (top row), at times with mismatches (black crosses), both the new-frame specified aminoacyl-tRNA (delivered as 
EF-Tu:GTP:aa-tRNA: in orange, yellow star, and red) and the release factors (e.g., RF2, in green) can compete to bind with the mismatch-encountering, frameshifted 
ribosome. In the latter case, the ribosome ceases translation and releases an incomplete polypeptide. 



mismatches at the two frameshifted tRNAs, in the E and P 
sites respectively, before the subsequent -1 frame aa-tRNA; 
every non-canonical base pairing is given a score of 
1, and G:U wobbles are allowed only at the third nucleotide 
position. Specifically, both -1 and -4 slips at codon position 
II result in fewer mismatches for the two tRNAs, as compared 
to those at codon position I (Figure 5B, middle bar graph, 
numbers of mismatches tabulated). Therefore, due to better 
alternative base-pairing options, codon position II on the 
A4C mutant becomes the productive frameshift pathway. 
This analysis indicates that, once the ribosome has attempted 
to frameshift, the nucleotide composition along the slippery 
sequence dictates the outcome. Whereas the accumulation 
of incomplete species reveals the frequent slippages attempted 
at codon position I, the specific A4C mutation rendered 
those events unsuccessful. Consequently, the overall frameshift 
attempts on a given mRNA template should be estimated 
by including the amount of both frameshifted products and 
incomplete species (Table 1). 
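
The scoring rule behind the dyad notation can be sketched as follows. This is a simplified illustration, not the procedure used to generate the tabulated values: anticodons are written as unmodified RNA triplets (modified nucleosides in real tRNAs are ignored), and the function names and example anticodons are assumptions made for the example.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}

def count_mismatches(codon, anticodon):
    """Count non-canonical pairs between a codon and an anticodon (both 5'->3').

    Pairing is antiparallel: codon position 1 pairs with anticodon position 3.
    G:U wobble pairs are tolerated only at the third codon position.
    """
    if len(codon) != 3 or len(anticodon) != 3:
        raise ValueError("codon and anticodon must each be 3 nt long")
    mismatches = 0
    for pos in range(3):
        pair = (codon[pos], anticodon[2 - pos])
        if pair in WATSON_CRICK:
            continue
        if pos == 2 and pair in WOBBLE:
            continue
        mismatches += 1
    return mismatches

def dyad(e_codon, e_anticodon, p_codon, p_anticodon):
    """Return (x, y): mismatches at the E site tRNA and at the P site tRNA."""
    return (count_mismatches(e_codon, e_anticodon),
            count_mismatches(p_codon, p_anticodon))

# Hypothetical unmodified anticodons: UUU (reads AAA/AAG) and GUU (reads AAC).
print(count_mismatches("AAA", "UUU"))  # fully paired
print(count_mismatches("AAG", "UUU"))  # G:U wobble at the third position
print(count_mismatches("AAC", "UUU"))  # C:U at the third position
print(dyad("AGC", "GUU", "AAA", "UUU"))
```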

In comparison to A4C, the A5G mutant promoted -1 frame- 
shift predominately at codon position I, whereas codon position 
II was a minor pathway (Figure 5B, top bar graph). The former 
slipping route leads to mismatches (tabulated in the graph) 
only at the first tRNA, which moves to the E site on the ribosome 
after translocation. In contrast, in the latter route, enduring only a 
P site mismatch, ~90% of the ribosomes ceased translation (top 
bar graph position II). Thus, P site mismatches appear to incur a 
higher penalty against continuing translation. 



The correlation between the production of incomplete poly- 
peptides and the occurrence of codon:anticodon mismatches 
is consistent with a retrospective fidelity check discovered in 
E. coli (Zaher and Green, 2009). After an amino acid mis-incorpo- 
ration, the 30S subunit can recognize base-pair mismatches 
in its P and E sites as translation errors. The authors showed 
that post-translocation ribosomes— particularly with P site 
mismatches— prematurely terminate translation by recruiting a 
release factor protein, e.g., RF2, into its A site, in competition 
with the A site codon specified aa-tRNA. Upon termination, 
incomplete polypeptides are released as premature drop offs 
from the ribosomes. The correspondence between this retrospective
quality control and our observations of prematurely
aborted polypeptides, in the presence of both RF1 and RF2, indicates
that a similar fidelity control operates during frameshifting
along the slippery sequence.

By integrating all of our findings presented here with the cur- 
rent understanding of the bacterial ribosome translation mecha- 
nism, we arrive at the dynamic frameshifting scheme illustrated 
in Figure 6. In response to the flanking mRNA structural barriers 
acting as mechanical restoring devices, the ribosome stochasti- 
cally makes multiple translocation attempts— i.e., excursions— 
promoted by the back-and-forth rotation of its 30S head. As 
each attempt has some probability of success, the succeeding 
translation may resume in a different frame, thereby mediating 
a widely branching set of translation pathways along the slippery 
sequence. The dynamic excursions observed in the single-ribo- 
some translation trajectories not only corroborate the perturbed 



878 Cell 160, 870-881, February 26, 2015 ©2015 Elsevier Inc.







ribosome translocation revealed by ensemble kinetic studies 
(Caliskan et al., 2014) but also further refine their frameshifting 
model by showing that the barrier-hindered ribosome makes, 
in fact, multiple attempts to translocate through the frameshift- 
programming sequence region. We see no evidence for frame- 
shifting pathways via schemes other than translocation excur- 
sions. Hence, although our data cannot rule out models in which 
A site tRNA accommodation can also mediate frameshifting 
(Chen et al., 2014), we adopt the simpler translocation-mediated 
pathway branching mechanism. 

DISCUSSION 

Versatile Pathway Branching Regulates Frameshifting 

Previous studies have indicated that ribosomes are able to 
frameshift despite the creation of mismatches that would nor- 
mally be perceived as translation errors (Atkins and Bjork, 
2009; Farabaugh, 1996; Harger et al., 2002; Tsuchihashi and 
Brown, 1992). It is known that translation accuracy is primarily 
monitored and verified at steps prior to the irreversible peptidyl 
transfer, ensuring proper charging and specific acceptance of 
the cognate aa-tRNA into the A site on the ribosome (Gromadski 
and Rodnina, 2004; Guth and Francklyn, 2007). In the frameshift- 
ing scheme presented here, the potential conflict between fra- 
meshifting and fidelity is alleviated along the mRNA translocation 
step. Any mismatches upon codon:anticodon re-pairing during
reading frame sampling would occur after peptide bond formation
(Figure 6) and are therefore not susceptible to the fidelity
controls governing proper mRNA decoding.

Instead, the retrospective fidelity check by the post-transloca- 
tion ribosome very likely becomes the critical quality control in 
programmed frameshifting— a context that showcases its bio- 
logical significance. This fidelity check subjects the mismatch- 
encountering ribosome to two competing routes— depending 
on the number and site of mismatches created upon a slip, the 
ribosome can either stop synthesis by recruiting a release factor 
or proceed to the next round of amino acid incorporation in the 
new reading frame. Therefore, regardless of how the surrounding 
mRNA structural barriers may kinetically trap the ribosome and 
effectively promote frameshifting, only a fraction of ribosome
slipping attempts during the dynamic translocation excursions
succeeds in yielding full-length frameshifted products. While
mutant slippery sequences render a random slipping attempt
risky, the dnaX slippery sequence, having evolved to offer
optimal thermodynamic stability for alternative base pairing,
facilitates passing the fidelity check.

Our observations illustrate the indispensable role played by 
the slippery sequence— as well as adjacent codon positions 
before and after it— to assure high efficiency of “programmed” 
frameshifting. We can begin to parameterize and predict the 
apparent slipperiness of a given frameshift-programming 
mRNA by considering: (1) for each 0 frame codon within the 
sequence region flanked by the upstream and downstream 
barriers, how many alternative cognate and near-cognate 
base-pairing positions— i.e., slipping routes— exist in the out- 
of-frames; (2) for each of the slipping routes, how feasible is 
the required slipping size, given that it must be attained by 
ribosome 30S head rotation during mRNA translocation. Pro- 



grammed mRNAs with greater totals of alternative codon:antico- 
don base-pairing options, weighted by the ease of the required 
slipping sizes, should exhibit greater frameshift efficiencies. 
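
As a rough illustration of the two criteria above, the following sketch enumerates alternative pairing positions for one 0-frame codon and sums them with a weight that decays with slip size. Everything quantitative here is a hypothetical placeholder, not a fitted quantity from this work: the mismatch cutoff, the exponential weight, and the assumption that each tRNA carries the exact Watson-Crick complement of its 0-frame codon.

```python
import math

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def mismatches(codon: str, anticodon_aligned: str) -> int:
    """Non-canonical pairs; G:U wobble tolerated at the third position only."""
    count = 0
    for position, (c, a) in enumerate(zip(codon, anticodon_aligned), start=1):
        if COMPLEMENT[c] == a:
            continue
        if position == 3 and {c, a} == {"G", "U"}:
            continue
        count += 1
    return count


def slipping_routes(mrna: str, codon_start: int,
                    max_slip: int = 4, max_mismatch: int = 1):
    """Return (slip, mismatches) for each alternative pairing position.

    `mrna` must be an uppercase RNA string. The tRNA is assumed to carry the
    exact complement of its 0-frame codon (a simplification of this sketch).
    """
    codon = mrna[codon_start:codon_start + 3]
    anticodon = "".join(COMPLEMENT[base] for base in codon)
    routes = []
    for slip in range(-max_slip, max_slip + 1):
        start = codon_start + slip
        if slip == 0 or start < 0 or start + 3 > len(mrna):
            continue
        n = mismatches(mrna[start:start + 3], anticodon)
        if n <= max_mismatch:
            routes.append((slip, n))
    return routes


def slipperiness(mrna: str, codon_starts, slip_scale: float = 2.0) -> float:
    """Sum hypothetical route weights exp(-|slip| / slip_scale) / (1 + mismatches)."""
    total = 0.0
    for codon_start in codon_starts:
        for slip, n in slipping_routes(mrna, codon_start):
            total += math.exp(-abs(slip) / slip_scale) / (1 + n)
    return total
```

Under these assumptions a homopolymeric stretch such as `"AAAAAAA"` scores far higher than a mixed sequence such as `"GCAGCUGAC"`, which is the qualitative behavior the two criteria predict.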

The fraction of completed frameshifted products (which,
as found here, can differ in length by a few amino acid residues)
reports only half of the story for programmed
frameshifting. The missing half is the previously unrecognized
prematurely terminated polypeptides, whose significance is
twofold. First, they likely reflect the retrospective fidelity control 
used by the ribosome; second, they represent the relics of un- 
successful slipping attempts induced by the surrounding sec- 
ondary structures. Intriguingly, ribosome stalling and premature 
termination, as the aftermath of unsuccessful slipping attempts, 
have recently been shown to serve as characteristics of frame- 
shift translation, which eukaryotic systems recognize to degrade 
exogenous programming mRNAs (Belew et al., 2014). Therefore,
these impeded slipping attempts— ultimately leading to the “off- 
pathway” incomplete translation products— may have profound 
implications for the regulation both of translation and of mRNA 
abundance inside cells. 

The overall frameshift attempts— composed of both the 
completed and the heretofore hidden aborted species— hence 
are the genuine measure for the frameshift-promoting strength 
of a programmed mRNA. In turn, this strength is determined by 
the mechanical properties of the mRNA structures that flank 
the slippery sequence and that interfere with the normal transla- 
tion cycle. In this work, we have utilized these mechanical prop- 
erties as a probe in situ: when the ribosome translates between 
the structural barriers, its intrinsic translocation dynamics are 
uniquely amplified, permitting a ribosomal slip and possibly 
engaging the retrospective fidelity check. Using optical twee- 
zers, we have captured the underlying dynamics of frameshifting 
translation in real time, providing a glimpse of an unexpectedly 
versatile translation scheme. 

EXPERIMENTAL PROCEDURES 

In vitro translation was performed using either the PURExpress ΔRibosome kit
(NEB) to synthesize polypeptide samples in large scale and analyze by MS or a 
custom-made reconstituted reaction mixture for real-time single-ribosome 
translation in the optical tweezer experiments. The same preparations of 
purified E. coli MRE600 ribosomes and mRNA constructs were used in both 
experiments. 

Exact composition and protocols of the PURExpress system have been
documented previously (Ohashi et al., 2010; see also Extended Experimental 
Procedures). All LC/MS experiments were performed on an LTQ Orbitrap XL 
mass spectrometer (Thermo Scientific) connected with an Agilent 1200 nano- 
flow HPLC system by means of nanoelectrospray. MS full scans were acquired 
in the Orbitrap analyzer (using internal lock mass recalibration in real time), 
whereas tandem mass spectra were recorded in the linear ion trap. Each trans- 
lation product was identified within 10 ppm deviation in mass (Da)— and with 
verification of its unique isotope pattern originated from specific amino acid 
composition. 
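
The 10 ppm criterion is a relative mass error. A minimal sketch of the check, with hypothetical function names and illustrative masses only:

```python
def ppm_error(observed_mass: float, theoretical_mass: float) -> float:
    """Relative mass deviation in parts per million."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6


def within_tolerance(observed_mass: float, theoretical_mass: float,
                     tol_ppm: float = 10.0) -> bool:
    """True if the observed mass lies within +/- tol_ppm of the theoretical mass."""
    return abs(ppm_error(observed_mass, theoretical_mass)) <= tol_ppm
```

For a hypothetical 1,000 Da product, an observed mass of 1,000.005 Da deviates by about 5 ppm and passes, whereas 1,000.02 Da (about 20 ppm) does not.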

Reconstituted reaction mixtures for tweezer experiments include elongation 
factors (EF-G and EF-Tu), release factors (RF1 and RF2), GTP, etc., along with
selectively charged aminoacyl-tRNAs to fuel translation inside a micro-fluidic
chamber. We first tether one ribosome-mRNA complex (single ribosome initiated
and halted on the long mRNA hairpin construct), via a pair of
terminal-modified dsDNA handles, between two 2.1 μm, surface-modified polystyrene
beads in the tweezers setup (Figure 3A). Before beginning translation, we 
register the starting position of the ribosome on the mRNA based on the 
downstream hairpin size measured; this measurement is done by applying
force to unfold and refold the hairpin. Upon flowing the above mixture into 
the chamber while holding the mRNA tether at constant force, we commence 
recording stepwise hairpin unwinding that reports single-ribosome translation 
in real time. After the course of translation, we again measure the residual 
hairpin to verify the ribosome termination position. Detailed protocols for pro- 
tein factor and material preparations, ribosome-mRNA complex formations, 
configurations of the single-trap optical miniTweezers (S.B. Smith, Tweezer- 
sLAB), step-by-step instrumental operations, and single-ribosome translation 
trajectory data analysis have been described in the literature and our previous 
studies (Dinçbas-Renqvist et al., 2000; Qu et al., 2011; Wen et al., 2008; see
also Extended Experimental Procedures). 
SUMMARY

Evolvability — the capacity to generate beneficial her- 
itable variation — is a central property of biological 
systems. However, its origins and modulation by 
environmental factors have not been examined sys- 
tematically. Here, we analyze the fitness effects of 
all single mutations in TEM-1 β-lactamase (4,997 variants)
under selection for the wild-type function
(ampicillin resistance) and for a new function (cefo- 
taxime resistance). Tolerance to mutation in this 
enzyme is bimodal and dependent on the strength 
of purifying selection in vivo, a result that derives 
from a steep non-linear ampicillin-dependent rela- 
tionship between biochemical activity and fitness. 
Interestingly, cefotaxime resistance emerges from 
mutations that are neutral at low levels of ampicillin 
but deleterious at high levels; thus the capacity to 
evolve new function also depends on the strength 
of selection. The key property controlling evolvability 
is an excess of enzymatic activity relative to the 
strength of selection, suggesting that fluctuating en- 
vironments might select for high-activity enzymes. 

INTRODUCTION 

Biological systems are often regarded as remarkably tolerant 
of genetic perturbations. Mutational robustness has been 
observed at nearly all levels of biological organization, from pro- 
tein structure and function (McLaughlin et al., 2012; Rennell 
et al., 1991; Suckow et al., 1996) to metabolic flux (Kacser and
Burns, 1981) to regulation of gene expression (Wagner, 2005a) 
to development (Waddington, 1942, 1953). Because processes 
that buffer the effects of genetic variation inevitably have conse- 
quences for evolutionary outcomes, an understanding of both 
the causes and consequences of robustness is of central impor- 
tance to biology (de Visser et al., 2003; Masel and Siegal, 2009; 
Wagner, 2005a). For example, recent studies (Draghi et al., 2010; 
Hayden et al., 2011; Payne and Wagner, 2014; Wagner, 2008) 
show that robustness can facilitate the adaptation of biological 
systems to environmental change— a property sometimes called 
“evolvability” (Kirschner and Gerhart, 1998). Besides being 
essential to basic evolutionary theory, understanding robustness 
may also be important in the engineering of useful proteins that
are more resilient to the effects of random mutations and for un- 
derstanding the emergence of mutations that impact human 
health. 

Due to their importance in defining phenotypes at the cellular 
and organismal level, and the ease with which large numbers 
of mutations can be introduced and assessed, proteins repre- 
sent ideal model systems for studying robustness and evolvabil- 
ity. The tolerance of many proteins to mutations has been as- 
sessed in a number of important studies with high-throughput 
site-directed and random mutagenesis (Fowler et al., 2010; 
Guo et al., 2004; Huang et al., 1996; Jacquier et al., 2013; 
Loeb et al., 1989; McLaughlin et al., 2012; Melnikov et al., 
2014; Rennell et al., 1991; Suckow et al., 1996). Overall, these 
studies suggest that the function of proteins is insensitive to 
the vast majority of amino acid changes (Bowie et al., 1990). By 
contrast, it is generally accepted that many missense mutations 
have measurable biophysical effects (e.g., on protein stability), 
supporting a view that most mutations are not neutral at the 
biochemical level (DePristo et al., 2005; Tokuriki and Tawfik, 
2009). One possible explanation for this discrepancy is that 
robustness and evolvability are characteristics that ultimately 
refer to organismal fitness, a property that is difficult to assess 
and whose relationship to biochemical parameters of a protein 
could be complex and is generally unknown. Indeed, many 
comprehensive studies of mutational tolerance in proteins have 
assessed biochemical traits (e.g., protein-binding affinity; Fowler 
et al., 2010; McLaughlin et al., 2012) or other phenotypes (e.g., 
minimal inhibitory concentration of antibiotic, MIC; Firnberg 
et al., 2014; Jacquier et al., 2013); although much has been 
learned from these studies, the relationships between these 
properties and fitness are less clear. 

More fundamentally, it is logical that the relationship between 
organismal fitness and biochemical parameters might vary signif- 
icantly with the strength of selective pressure acting on the pro- 
tein. If so, then robustness and evolvability must not be considered 
as absolute, invariant features of proteins but instead as proper- 
ties that depend on environmental or experimental conditions 
that control purifying selection. To examine these ideas rigorously, 
we require (1) a quantitative mapping of the relationship between 
in vivo fitness and in vitro biochemistry in an appropriate model 
system, (2) a study of how both mutational sensitivity to existing 
function and the capacity to evolve new function depend upon se- 
lection pressure, and (3) a mechanistic principle for how these 
characteristics emerge from the properties of extant proteins. 

Figure 1. Experimental Scheme

(A) Strategy for comprehensive assessment of the fitness effects of mutations in TEM-1
(Experimental Procedures and Extended Experimental Procedures). Each codon corresponding
to the 263 amino acid positions of the mature TEM-1 protein was individually mutated to
NNS (where N is a mixture of all four nucleotide bases, and S a mixture of C and G) using
PCR. PCR products were combined in equimolar ratios, cloned into the pBR322 plasmid, and
transformed into E. coli. The resulting whole-gene saturation mutagenesis library was then
selected in ampicillin at either 0 μg/ml (unselected library) or various concentrations
(selected library). Illumina 75 bp paired-end sequencing was used to obtain allele counts N
for each amino acid mutation a at every position i and under each selection condition;
relative fitness effect F_i^a is assessed as the logarithmic increase in allele counts
in the selected library versus the unselected library, relative to the wild-type allele.

(B) Results of growth assays (n = 3) for E. coli cells harboring wild-type TEM-1 under
selection at various concentrations of ampicillin, indicating that growth is
unaffected at ampicillin concentrations < 2,500 μg/ml (Extended Experimental Procedures).

See also Figure S1.





The problem of how robustness and evolvability are conditional
on selection strength can be investigated in proteins
conferring antibiotic resistance by comprehensive experimental
assessment of the fitness effects of mutations as a
function of antibiotic concentration. Here, we focus on a
powerful model system for protein evolution studies, TEM-1
β-lactamase (Bershtein et al., 2006; Salverda et al., 2010;
Weinreich et al., 2006). TEM-1 is an enzyme that hydrolyzes
penicillin-class β-lactam antibiotics (e.g., ampicillin); the ability
of bacteria to survive and reproduce (i.e., fitness) in the presence
of these antibiotics relies solely on TEM-1 activity (Matagne
et al., 1998; Medeiros, 1997). Using an application of
deep sequencing (Fowler et al., 2010; McLaughlin et al.,
2012), we determined the effects on organismal fitness of all
single amino acid mutations in TEM-1 (4,997 mutations) under
selection for ampicillin resistance (the wild-type function) and
for resistance to cefotaxime (a new function). By assessing
fitness under increasing concentration of ampicillin, we
demonstrate that robustness indeed depends strongly and
inversely on the strength of purifying selection. The pattern
of mutational sensitivity is hierarchically organized in the
atomic structure, building out from the active site as a function
of ampicillin concentration in physically contiguous but anisotropic
amino acid networks. Furthermore, we find that in TEM-1,
evolvability is facilitated by robustness and therefore is also
dependent on the strength of selection; mutations conferring
cefotaxime resistance are predominantly neutral under selection
at low to moderate ampicillin concentrations yet deleterious
to fitness under high concentration. To mechanistically
understand these properties, we propose a simple kinetic
model that describes the fitness effects of all mutations in
our study as a function of ampicillin concentration and intracellular
β-lactamase activity. This model shows that, mechanistically,
robustness and evolvability can be understood to
arise from an excess of intracellular enzymatic activity relative
to the strength of selection, a finding that suggests a role for
fluctuating environments in the evolution of high-activity
enzymes.



RESULTS 



Whole-Gene Saturation Mutagenesis and Fitness under
Ampicillin Selection

We used site-directed mutagenesis to create a whole-gene saturation
mutagenesis library comprising all 19 possible single-site
amino acid point mutations at every position in the mature form
of TEM-1 β-lactamase (4,997 amino acid mutations total; Experimental
Procedures and Extended Experimental Procedures;
Figure 1A). To assess the effects of selection strength on robustness,
the library was transformed into E. coli and selected at
several concentrations of ampicillin ranging from zero to a
concentration just below that at which cells encoding even
wild-type TEM-1 decline considerably in fitness (2,500 μg/ml
ampicillin; Extended Experimental Procedures; Figure 1B).
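
The library size follows from simple counting, and the NNS codon scheme (described in the Figure 1 legend) can be checked against the standard genetic code. The short sketch below is only a sanity check written for this text; the variable names are illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# NNS: any base at positions 1 and 2, C or G at position 3.
nns_codons = ["".join(codon) for codon in product(BASES, BASES, "CG")]
nns_stops = [codon for codon in nns_codons if CODON_TABLE[codon] == "*"]
nns_amino_acids = {CODON_TABLE[codon] for codon in nns_codons} - {"*"}

# 263 mature positions, each with 19 non-wild-type amino acids.
library_size = 263 * 19

print(len(nns_codons), len(nns_amino_acids), nns_stops, library_size)
```

NNS therefore samples all 20 amino acids with 32 codons while admitting only one of the three stop codons (TAG), and 263 positions with 19 substitutions each gives the 4,997 single mutations quoted above.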

Illumina 75 bp paired-end sequencing was used to obtain
counts for each mutant allele after selection at each ampicillin
concentration; an average of 1,000 counts per amino acid mutation
was obtained under conditions of no selection (0 μg/ml
ampicillin; Figure S1A). The relative fitness F_i^a of each amino
acid mutation a at each position i is assessed as the logarithm
of the allele counts in the selected population versus the
unselected population (0 μg/ml ampicillin), relative to
the wild-type allele:



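$$F_i^a = \log_{10}\left[\frac{N_i^{a,\mathrm{sel}}}{N_i^{a,\mathrm{unsel}}}\right] - \log_{10}\left[\frac{N^{\mathrm{WT},\mathrm{sel}}}{N^{\mathrm{WT},\mathrm{unsel}}}\right] \qquad (1)$$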



Mutations that show no fitness effect have values of F_i^a close to
that of wild-type (F_i^a = 0), and those with an increase or decrease in
fitness relative to wild-type have a positive or negative value of F_i^a,
respectively, in proportion to their effect. The values of F_i^a are
generally reproducible over two independent trials (r² = 0.91 at
2,500 μg/ml ampicillin; Figure S1B), and effects due to codon
bias under ampicillin selection appear to be small (r² = 0.96 for relative
fitness between all synonymous mutations at 2,500 μg/ml
ampicillin; Figure S1C). Examination of the distribution of F_i^a for
all mutations under no selection (0 μg/ml ampicillin) provides a
basis for defining cutoffs for F_i^a corresponding to statistically
neutral effects on fitness (mean ± two SD in F_i^a; Figure S1D).
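
The relative fitness of Equation 1 and the neutrality cutoff (mean ± two SD of the values measured without selection) can be computed directly from allele counts. The sketch below is illustrative: the function names and counts are hypothetical, and the handling of zero counts (here, an error) is an assumption, since the text does not say how such alleles were treated.

```python
import math
from statistics import mean, stdev


def relative_fitness(n_sel: float, n_unsel: float,
                     wt_sel: float, wt_unsel: float) -> float:
    """Equation 1: log10 enrichment of a mutant allele relative to wild type."""
    if min(n_sel, n_unsel, wt_sel, wt_unsel) <= 0:
        raise ValueError("all allele counts must be positive")
    return math.log10(n_sel / n_unsel) - math.log10(wt_sel / wt_unsel)


def neutral_band(fitness_without_selection):
    """Cutoffs for statistically neutral effects: mean +/- two SD."""
    m = mean(fitness_without_selection)
    s = stdev(fitness_without_selection)
    return (m - 2 * s, m + 2 * s)


def is_neutral(fitness: float, band) -> bool:
    """True if a fitness value falls inside the neutral band."""
    low, high = band
    return low <= fitness <= high
```

With these definitions, a mutant whose counts fall tenfold under selection while wild-type counts are unchanged has a relative fitness of -1, and a mutant that tracks wild type exactly has a relative fitness of 0.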

Robustness Is Conditional on the Strength of Purifying 
Selection 

Figure 2 shows the fitness effect of all single amino acid
mutations in TEM-1 at all ampicillin concentrations examined
(10, 39, 156, 625, and 2,500 μg/ml; see also Table S1). Under
weak selection at a low ampicillin concentration (39 μg/ml; no
selection was observed at 10 μg/ml; Figures 2A and 3A), the vast
majority of mutations are statistically neutral (F_i^a = 0), and only
a small fraction of mutations significantly affect fitness (Figures
2B and 3B). These include highly conserved positions within
the active site (S70, K73, S130, D131, N132, K234, and G236;
numbering according to Ambler; Ambler et al., 1991) but also a



Figure 3. Distribution of Fitness Effects

(A-E) Histograms of F_i^a values show that the distribution
of the fitness effects (DFE) is bimodal and
depends on the strength of purifying selection: (A)
10, (B) 39, (C) 156, (D) 625, and (E) 2,500 μg/ml
ampicillin. Red lines are heuristic fits to a bi-Gaussian
function. The range of F_i^a that corresponds
to a statistically neutral fitness effect is
indicated in gray. Insets show DFEs enlarged over
the range 0-0.1.

(F and G) The pattern of mutational sensitivity
involves spatially heterogeneous yet physically
connected networks of residues, building out from
the active site and core as the strength of purifying
selection is increased. Shown are (F) surface and
(G) slice representations of the pattern of sensitivity
to single-site amino acid point mutations at
each ampicillin concentration mapped onto the
structure of TEM-1 (PDB: 1FQG). Colored spheres
indicate residues with a significant positional
fitness effect (see Figure S2); different colors
indicate results at each ampicillin concentration.
Co-crystallized β-lactam (penicillin G) is shown
as yellow stick bonds. See also Figure S2 and
Table S2.



subset of more moderately conserved 
positions distributed within the protein 
core (Figures 3F and 3G). As the ampi- 
cillin concentration used for selection is 
further increased, the overall fitness cost 
of mutations dramatically increases— 
both in the number of mutations that 
show a fitness effect and in the degree 
of their effect relative to wild-type (Fig- 
ures 2C-2E and 3C-3E). To some extent, 
these results seem obvious; it is reasonable to expect that the 
fitness cost of mutations in an enzyme will depend on the 
strength of purifying selection for its function. In contrast, no sin- 
gle mutations at any concentration significantly increase fitness, 
indicating that TEM-1 occupies a local peak in its genotype- 
fitness landscape under the conditions of these experiments. 

More interestingly, the distribution of the fitness effects of
mutations (DFE) is decidedly bimodal, with one mode corresponding
to mutations with significant deleterious effects on fitness,
and the other mode comprising those with neutral or nearly 
neutral fitness effects (Figures 3A-3E and Table S2). The 
strength of ampicillin selection controls the fraction of mutations 
in these two modes; increasing antibiotic concentration causes 
the relative proportion of mutations in the mode with deleterious 
mutations to increase and in the mode with neutral/nearly neutral 




Figure 2. Fitness Effects of All Single Amino Acid Mutations in TEM-1 under Increasing Ampicillin Selection

Shown are the data matrices containing the relative fitness effect for every amino acid mutation at every position in TEM-1 under selection with ampicillin at (A) 10,
(B) 39, (C) 156, (D) 625, or (E) 2,500 μg/ml. Rows within each matrix depict positions along the primary sequence of TEM-1, and columns indicate a mutation to one
of the 20 amino acids (in alphabetical order by one-letter code indicated at bottom of A). Relative fitness effect is depicted colorimetrically with blue representing a
deleterious fitness effect, red a positive fitness effect, and white no fitness effect relative to wild-type. Positions for which no data were obtained at 10 μg/ml
ampicillin are colored gray. The secondary structure of the wild-type sequence is indicated to left of (A). Several highly conserved motifs within the active site
(S70-X-X-K73, S130-D131-N132, E166, and K234-S235-G236) are indicated to the right of (E). See also Table S1.
Figure 4. A Phenotype-Fitness Landscape
for TEM-1

(A) Overview of the kinetic model describing
fitness as a function of ampicillin concentration
and intracellular β-lactamase activity (Experimental
Procedures and Extended Experimental
Procedures). The periplasmic concentration of
ampicillin (green) is determined by the rate of
passive diffusion across the outer membrane and
the rate of inactivation (hydrolysis, denoted by X)
by β-lactamase (red). Growth (i.e., fitness) is proportional
to the rate of peptidoglycan strand-joining
(yellow) by PBPs (orange); PBP activity,
and thus fitness, is inhibited (dashed line) by
unhydrolyzed ampicillin in the periplasm.

(B-E) Surface representation of the model
revealing a non-linear relationship between
phenotype (intracellular β-lactamase activity:
K_M and V_max) and fitness marked by a plateau at
neutrality (F_i^a = 0) that is dependent on the strength
of selection: (B) 39, (C) 156, (D) 625, and (E)
2,500 μg/ml ampicillin. Model estimated values of
kinetic parameters and fitness for wild-type TEM-1
(green sphere) and all 4,997 single mutations (silver
spheres) are shown.

See also Figure S3 and Table S3.




mutations to decrease. In addition, to different extents, both
modes shift toward a greater average fitness cost as a function
of ampicillin concentration (Table S2). Thus, these data (Figures
2 and 3) reveal that robustness in TEM-1 is strongly dependent
on the strength of selection: under weak selection, TEM-1 is
more robust as most mutations are neutral or nearly neutral in
their fitness effects. But as selection is increased, robustness
concomitantly decreases as more and more mutations become
deleterious to fitness.

Mapping of the data onto an atomic structure of TEM-1 reveals
a spatially anisotropic pattern of amino acid contributions to
organismal fitness (Figures 3F and 3G). Here, we describe the
average fitness cost of all mutations at each position (⟨F_i^a⟩_a) at
each ampicillin concentration; for comparison, positions sensitive
to mutation are defined based on the distribution of ⟨F_i^a⟩_a
under selection at 2,500 μg/ml ampicillin (Figure S2). Under weak
selection (39 μg/ml ampicillin), mutation-sensitive
positions comprise a physically
contiguous but anisotropic network of
residues buried within the protein core
and extending out from the active site.
As the level of selective pressure is
increased, this main “functional core” acquires
successive shells of mutation-sensitive
residues that grow outward until
nearly the whole protein core shows
some degree of fitness cost upon mutation.
Nevertheless, the heterogeneity in
positional contribution to fitness persists
even at the highest levels of ampicillin;
the most mutationally sensitive positions
at 2,500 μg/ml ampicillin are similar to
the functional core defined at 39 μg/ml ampicillin (Figure 7,
compare panels D and F).

Robustness as an Excess of Intracellular Enzymatic 
Activity Relative to the Strength of Purifying Selection 

To study the mechanistic basis for both the dependency of
robustness on selection strength and the bimodality of the
DFE, we developed a simple kinetic model describing relative
fitness as a function of ampicillin concentration and intracellular
β-lactamase activity (Figure 4A, Experimental Procedures, and
Extended Experimental Procedures). In this model, organismal
fitness is proportional to the flux of peptidoglycan substrate
through DD-carboxypeptidases and other penicillin-binding proteins
(PBPs) involved in cell-wall biogenesis, with ampicillin
acting as a competitive inhibitor of this process. The periplasmic
concentration of ampicillin is dynamically set by the balance of
intracellular β-lactamase activity and passive diffusion of ampicillin
across the outer membrane (Zimmermann and Rosselet,
1977). We determined values for the free parameters of our
model by globally fitting the experimental data obtained for all
4,997 mutations (Extended Experimental Procedures and Table
S3). The free parameters uniquely converge, and the model fits
the data well with an overall r² = 0.98.

The main result of our model is the finding of a non-linear 
phenotype-fitness relationship in which fitness saturates at 
different levels of enzyme activity as a function of the concentra- 
tion of applied ampicillin (Figures 4B-4E). The non-linearity 
arises from two sources: the steady state achieved between 
diffusion and hydrolysis of ampicillin and the competitive inhibi- 
tion of the PBPs. This result provides a simple explanation for 
both the dependence of mutational robustness on selection 
strength and the bimodal nature of the DFE. In the model, 
enzyme activity of every mutant is defined by two parameters,
the maximum intracellular rate of ampicillin hydrolysis (V_max)
and the concentration of ampicillin that produces half-maximal
rate (K_M). Due to the high overall activity of TEM-1 for ampicillin,
the wild-type enzyme resides well within the saturated regime
(plateau) of the fitness-activity relationship. Mutational robustness
emerges as a consequence because within the saturated
region, changes in enzyme activity due to mutation have negligible
effects on fitness (de Visser et al., 2003; Hartl et al., 1985;
Kacser and Burns, 1981; Wagner, 2005a). The model is also
consistent with the dependence of robustness on the strength 
of selection because fitness saturates at increasingly higher 
levels of activity as the ampicillin concentration increases (Fig- 
ures 4B-4E, S3A, and S3B). That is, robustness collapses 
steadily with increasing selection pressure because the fitness- 
activity relationship depends on the steady-state levels of ampi- 
cillin in the periplasm. Finally, the bimodal nature of the fitness ef- 
fects of mutations can be understood as a direct consequence of 
the steep non-linearity relating enzyme activity to fitness. For 
example, mutational variation in enzyme activity has basically 
two outcomes: to stay in front of the non-linearity and have a 
relatively negligible effect on fitness, or to cross the non-linearity 
“threshold” and display a profound effect on fitness (see Figures 
S3C-S3G for a simulation). Further contribution could come from 
inherent non-linearities in the effects of mutations on TEM-1 ac- 
tivity, but the model shows that such is not required for bimo- 
dality in the DFE. In short, the model shows that robustness 
and its dependency on the strength of selection arise from the 
high intracellular activity of TEM-1 and the ampicillin-dependent 
non-linear saturation relationship between enzyme activity and 
fitness. 

A general model for mutational robustness in proteins has 
been previously proposed based on the thermodynamic stability 
of proteins. The idea is that the “extra” stability beyond that 
required to asymptotically populate the native state provides a 
thermodynamic basis for robustness by buffering the slightly de- 
stabilizing effects of most mutations (Bloom et al., 2005; Tokuriki 
and Tawfik, 2009; Wylie and Shakhnovich, 2011). We note that 
our conclusions are not inconsistent with this view; the overall 
rate of ampicillin hydrolysis in vivo is a combination of both the 
fraction of natively folded β-lactamase protein and the specific
activity of the native state, and mutations could influence either 



or both properties. More generally, we propose that robustness 
comes from an excess of intracellular enzymatic activity relative 
to the fitness threshold present at a particular strength of selection,
a description that combines the probability of native-state
folding with the biochemical parameters controlling catalytic 
power and accounts for the dependency of robustness on the 
strength of selection. 

Evolvability and Fitness under Cefotaxime Selection 

Robustness implies invariance of organismal fitness upon muta- 
tion, which at first glance suggests that robust biological systems 
should have a decreased ability to evolve new phenotypes, or 
decreased evolvability. On the other hand, mutations that are 
neutral with regard to the current or wild-type function might be 
able to promote new functions; indeed, such “conditional 
neutrality,” whether with respect to genetic background or envi- 
ronmental factors, has been suggested to facilitate evolvability 
by permitting the accumulation of mutations that could be useful 
upon changes in selective pressure (de Visser et al., 2003; Kirsch- 
ner and Gerhart, 1998, 2005; Masel and Trotter, 2010; Wagner, 
2005a, b). To assess the relationship between robustness and 
evolvability in TEM-1, we performed selection on our whole- 
gene saturation mutagenesis library in the presence of a different 
β-lactam drug, cefotaxime. Cefotaxime is a poor substrate for
TEM-1, with an approximately 1000-fold decrease in kcat/Km for
cefotaxime versus ampicillin (≈10⁴ versus ≈10⁷ M⁻¹ s⁻¹,
respectively). As such, TEM-1 imparts no significant fitness 
advantage; the minimal inhibitory concentration (MIC) of cefotaxime
for E. coli cells encoding wild-type TEM-1 is essentially unchanged
from that of cells without β-lactamase (0.0625 μg/ml)
(Hall, 2002). However, single amino acid changes in TEM-1 are
known to increase resistance to cefotaxime both in nature and
in the laboratory (Matagne et al., 1998; Salverda et al., 2010). Selection
was performed at 0.15 μg/ml cefotaxime (approximately
double the MIC for TEM-1), and the fitness of each mutation relative
to wild-type TEM-1 was determined as described above.

Figure 5 shows the fitness effect of all single mutations under 
cefotaxime selection, and Figure S4 shows the corresponding 
DFE (see also Table S4). In contrast to the results obtained under 
ampicillin selection, no mutations show a significant fitness 
decrease relative to wild-type TEM-1, a result that simply reflects
the already poor activity of TEM-1 on cefotaxime. However, a
small number of mutations (106 total, or 2%) act to increase
fitness; among these we observe alleles previously reported to
impart an extended-spectrum phenotype in TEM-1 and/or
its homolog SHV-1 in clinical isolates (E104K, R164H, R164S,
D179G, D179N, G238A, G238S) (Matagne et al., 1998; Salverda
et al., 2010) and in error-prone PCR libraries (Bershtein and Tawfik,
2008; Schenk et al., 2012).

Robustness and Adaptation in TEM-1 

A comparison of the relative fitness effects obtained for each 
mutation at each ampicillin concentration versus their respective 
effects under cefotaxime selection reveals how the robustness 
of TEM-1 under ampicillin selection (the current or wild-type 
function) relates to its evolvability toward cefotaxime resistance 
(a new function) (Figures 6A-6D). Under weak selection for ampicillin
resistance (e.g., 39 μg/ml ampicillin, Figure 6A), nearly all
Figure 5. Fitness Effects of All Single Amino Acid Mutations in
TEM-1 under Cefotaxime Selection

Shown is the data matrix depicting Ff for each amino acid mutation at each
position in TEM-1 under selection with the non-optimal TEM-1 substrate
cefotaxime at 0.15 μg/ml. Labels and coloring are as in Figure 2. Several positions
previously reported to confer increased cefotaxime resistance upon
mutation in TEM-1 or its homolog SHV-1 are indicated at right (E104, R164,
D179, and G238). See also Figure S4 and Table S4.



mutations conferring significant cefotaxime resistance are statistically
neutral in their fitness effect in ampicillin; this includes
all the above-stated mutations found in clinical isolates. These
mutations are said to be conditionally neutral with respect to
the environment (as their phenotypes depend on the condition
of selection) and have been linked to the rate of evolution of
new phenotypes (Draghi et al., 2011; Wagner, 2005b). However,
under selection at increasing ampicillin concentration, mutations
conferring cefotaxime resistance have progressively deleterious
fitness effects in ampicillin; for example, the mutations R164H,
R164S, D179G, D179N, and G238S, which are neutral at
39 μg/ml ampicillin, now have significant deleterious fitness effects
at 2,500 μg/ml ampicillin (Figure 6D). That is, the neutrality
of these mutations is itself conditional on the strength of selection,
present at low levels of ampicillin and diminishing at higher
levels. Thus, for TEM-1, more useful genetic variation (mutations
conferring cefotaxime resistance) is proportionately available
when robustness is high (weak ampicillin selection) than when
robustness is low (strong ampicillin selection; Figure 6E).

To test this in an independent experiment, we created a library
of TEM-1 variants by error-prone PCR (average of 1 ± 1 mutations
per gene; Extended Experimental Procedures), transformed
it into E. coli, and screened for growth of the population
on cefotaxime (0.2 μg/ml) either with prior selection (Figures
6F, S5A, and S5B) or with co-selection (Figures S5C and S5D)
on varying doses of ampicillin (Extended Experimental Procedures).
The data show that the growth of the population
of TEM-1 variants on cefotaxime is a function of ampicillin exposure
and very nearly reflects the fraction of mutations that confer
cefotaxime resistance but are statistically neutral for ampicillin in
the comprehensive single-mutation library (Figure 6E). We
conclude that in TEM-1, robustness enhances evolvability by
permitting environmentally conditional neutral mutations that
can confer cefotaxime resistance.

The Spatial Distribution of Cefotaxime Adaptation 
in TEM-1 

Mutations conferring initial cefotaxime resistance are distributed 
broadly throughout the tertiary structure of TEM-1 and include 
residues both in the core and on the surface of the protein (residues
in dark gray, Figure 7). However, there is an interesting
and physically informative pattern to the organization of these
mutations. Few cefotaxime-adaptive mutations are directly within
the functional core defined by residues with significant fitness effects
at 39 μg/ml ampicillin (residues in red, Figures 3G, 7A, and
7D), consistent with the finding that they are largely neutral in
fitness effect under weak ampicillin selection (Figure 6A). But
remarkably, the mutations conferring cefotaxime resistance are 
organized into sparse, physically connected networks of amino 
acids that connect a subset of surface positions to the functional 
Figure 6. Robustness and Evolvability of
TEM-1

(A-D) Comparison of Ff values for all mutations
under ampicillin (x axis) versus cefotaxime selection
(0.15 μg/ml; y axis) shows that most mutations
conferring significant cefotaxime resistance are
statistically neutral in fitness effect at a low ampicillin
concentration but deleterious at the highest
ampicillin concentration. Mutations that confer
cefotaxime resistance and are either statistically
neutral or deleterious
in fitness effect under selection at the indicated
ampicillin concentration are colored red and green,
respectively; (A) 39, (B) 156, (C) 625, and (D)
2,500 μg/ml ampicillin. Several mutations previously
reported to impart an “extended-spectrum”
phenotype in TEM-1 and/or its homolog SHV-1
in clinical isolates are indicated.

(E) A comparison of the fraction of statistically 
neutral mutations that confer cefotaxime resis- 
tance at each ampicillin concentration as deter- 
mined from the whole-gene saturation mutagen- 
esis data. Results are shown as the mean and SD 
from two independent ampicillin selection experi- 
ments for all mutations (orange) or those obtain- 
able by single-nucleotide mutation from TEM-1 
(blue). 

(F) Results of growth experiments (n = 6) in which 
an error-prone PCR library of TEM-1 was sub- 
jected to pre-selection at several concentrations 
of ampicillin followed by cefotaxime selection 
(Extended Experimental Procedures). In (E) and (F), 
asterisks denote that results at 2,500 μg/ml ampicillin
are significantly different from those at all
other ampicillin concentrations (one-way ANOVA 
and post-hoc Tukey test, p < 0.05). 

See also Figure S5. 



core (Figure 7D). The physical interpretation of such “pathways”
of amino acid connectivity remains to be established, but the pattern suggests
that these represent anisotropic collective modes
in the protein structure that functionally connect a small set of surface
sites to the protein active site (Lee et al., 2008; Reynolds
et al., 2011). This architecture means that about 50% of
cefotaxime-adaptive single mutations (53/106 total) occur at surface
positions far from the active site. The effect of selection under
high levels of ampicillin is to reduce the set of mutations available
for conferring cefotaxime resistance by reducing the
likelihood of those that occur within the protein core and near
the active site (Figures 6D, 7B, 7C, 7E, and 7F). Nevertheless,
the distributed architecture of adaptive mutations is such that
some surface positions (e.g., E104, T195, E197, E240) still maintain
the capacity for initiation of cefotaxime resistance while remaining
neutral over the full range of ampicillin concentrations examined here.

DISCUSSION


In summary, these results show that both 
the robustness and evolvability of TEM-1 
are not invariant properties of the protein but instead are depen- 
dent on the strength of purifying selective pressure applied. At 
low doses of ampicillin, the high activity of TEM-1 and the non- 
linear dependence of fitness on enzyme activity render many mu- 
tations statistically neutral in their fitness effects under moderate 
selection conditions. Under these conditions, we find that the 
constraints on fitness are loaded in a small, physically connected 
network of residues in the TEM-1 tertiary structure (a functional
core) that is built around and extends from the active site.
Interestingly, a small fraction of positions showing neutral variation
display the capacity to confer increased resistance to a
new β-lactam drug, cefotaxime, upon single amino acid mutation.
These positions comprise contiguous amino acid networks
Figure 7. Structural Relationship of Robustness and Evolvability

(A-C) Shown are histograms of the average fitness effects of mutations at each position, ⟨Ff⟩, under selection at (A) 39 μg/ml and (B and C) 2,500 μg/ml ampicillin
(see Figure S2). Red lines are fits to a double Gaussian function. The dashed line indicates a ⟨Ff⟩ cutoff for positions with significant mutational sensitivity (see
Figure S2), whereas colored bars indicate the range of ⟨Ff⟩ for positions shown on structures in panels to the right (note the lower range in C).

(D-F) The spatial pattern of adaptability toward cefotaxime (dark gray spheres) in the context of mutational sensitivity under ampicillin selection at (D) 39 μg/ml
(red spheres) or (E and F) 2,500 μg/ml (blue or white/blue spheres); shown are slices through the core of the TEM-1 protein, and the surface of TEM-1 is in mesh.
The number of adaptive mutations per position is in parentheses (note that not all positions conferring cefotaxime resistance are labeled). The data show that
adaptation to cefotaxime resistance in TEM-1 arises from different physically contiguous networks of residues that connect specific distal sites to the core
catalytic and structural residues defined by mutational sensitivity at low levels of ampicillin selection (red, D). At high levels of ampicillin (B and E), most core
positions display a significant fitness cost upon mutation, reducing the number of cefotaxime-adaptive but ampicillin-neutral mutations. The pattern of mutational
sensitivity remains heterogeneous even at high ampicillin levels (C and E), with positions showing the largest fitness cost similar to those with a significant fitness
cost at low ampicillin (compare D and F).



that extend from the functional core to connect the active site to a 
number of distantly positioned surface positions at which cefotaxime
resistance can be acquired. The robustness of TEM-1,
however, collapses at high levels of ampicillin, as more and
more mutations are drawn into the regime where they affect
organismal fitness. With the loss of robustness, the capacity for
adaptation to cefotaxime is also much reduced as many adaptive
mutations that were neutral at low ampicillin concentration now
have a significant impact on fitness. It is important to note that
these data address only the initial single mutation step toward 



the evolution of new function and with only two substrates; it 
will be interesting to examine how paths of higher-order mutations
and robustness and evolvability under different β-lactam selection
regimes are related to the functional and adaptive architecture
of TEM-1 described by the global saturation
mutagenesis data presented here. 

The key property underlying both robustness and evolvability 
of TEM-1 is the high activity of the enzyme in vivo: the enzyme
sits far along the saturated region in the activity-fitness relationship
(Figure 4). In much the same way that an excess of
thermodynamic stability has been associated with enhanced
protein robustness and evolvability (Bloom et al., 2006; Bloom 
et al., 2005; Tokuriki and Tawfik, 2009), our results here make 
the more general point that an excess of activity in vivo (from 
high catalytic activity or in vivo enzyme concentration, or both) 
may underlie both of these properties. 

Importantly, these findings suggest new testable hypotheses 
for how enzymes like TEM-1 could be selected to have such 
high enzymatic activity. TEM-1 is an example of a so-called per- 
fect enzyme, catalyzing the hydrolysis of penicillin-class antibi- 
otics near the diffusion-controlled limit (Matagne et al., 1998). 
Why should it be so highly active if organismal fitness can satu- 
rate at much lower levels of activity? One obvious hypothesis is 
simply that TEM-1 may have evolved a high activity phenotype 
because of direct selection under periods of high concentrations 
of penicillin-class β-lactam compounds (e.g., ampicillin) or under
conditions of high mutation rate (Wilke et al., 2001). However, the
finding that cefotaxime resistance in TEM-1 predominantly 
emerges from mutations that are neutral under ampicillin selec- 
tion at low to moderate doses suggests that high activity could 
also result indirectly due to selection for evolvability. In this scenario,
ancestral TEM-1 also randomly encountered other non-optimal
β-lactams (e.g., cephalosporins). Such conditions would
favor evolvable variants, those capable through mutation of conferring
resistance to non-optimal β-lactams while still maintaining high
fitness in the presence of ampicillin. Given that the data show
that high in vivo enzyme activity underlies robustness, and
robustness in turn promotes new activities through harboring
conditionally neutral mutations, it follows that an enzyme
with high activity would be evolutionarily favored under fluctuations
in the distribution of β-lactam substrates. Thus, mechanisms
that promote evolvability could be selected as a result of
their success under historical environmental fluctuations (Kirsch- 
ner and Gerhart, 1998, 2005). Future work will experimentally 
address the notion that the history of environmental fluctuations 
fundamentally defines the robustness and evolvability of natural 
proteins. 

EXPERIMENTAL PROCEDURES 

Whole-Gene Saturation Mutagenesis Library Construction 

A comprehensive whole-gene saturation mutagenesis library was constructed 
by an overlap extension PCR mutagenesis technique (Higuchi et al., 1988; 
McLaughlin et al., 2012). To permit full coverage of the blaTEM-1 coding region
(789 bp) with 80 base paired-end reads by Illumina sequencing, we split the
sequence into ten subgroups (amino acid positions 26-51, 52-78, 79-104,
105-132, 133-156, 157-183, 184-209, 210-236, 237-264, and 265-290).
The mutagenesis PCR products for the positions of each subgroup were
mixed in equimolar ratios and ligated as a single library. Each subgroup was
independently subjected to selection in antibiotic and sample preparation for
Illumina sequencing. A detailed description of the cloning procedure is provided
in the Extended Experimental Procedures.

Antibiotic Selection 

All selection experiments for the whole-gene saturation mutagenesis library 
were performed in E. coli MegaX DH10B T1 (Invitrogen). Selection was per- 
formed in 96-well deep-well plate format at 37°C in Luria-Bertani broth (Fisher 
Scientific) containing 12 μg/ml tetracycline hydrochloride (Sigma). Wells contained
either 0, 10, 39, 156, 625, or 2,500 μg/ml of ampicillin or 0.15 μg/ml cefotaxime
at 25-fold concentration. The duration of growth (~2 hr) was chosen
to obtain significant selection while maintaining sufficient population size 



(~10® cells) relative to the library diversity and avoiding stationary growth. De- 
tails are provided in the Extended Experimental Procedures. 

Illumina Sequencing

Samples for Illumina sequencing were prepared by PCR from libraries after
antibiotic selection as previously described (McLaughlin et al., 2012). Addition
of adaptor sequences for Illumina sequencing was performed in two rounds:
the first round amplifies the mutated region of TEM-1, adds the annealing
site for the Illumina paired-end sequencing primer, and incorporates a 4 bp
barcode to indicate the concentration of antibiotic. The second round adds
the remainder of the sequencing primer annealing site along with the annealing
site for the Illumina flow cell. Primer sequences are available upon request.
Illumina sequencing and determination of allele counts were performed as previously
described (McLaughlin et al., 2012). Sequencing was performed at the
UT Southwestern Genomics Core on an Illumina Genome Analyzer IIx using a
version 4 paired-end PE-75 flow cell. Sequences from the Illumina base-caller
were imported into CLC Genomics Workbench and trimmed for size and 
quality using a cutoff of 0.01 for the modified Mott algorithm. Custom scripts 
written in MATLAB were used to count the number of each allele under each 
selection condition and to determine relative fitness values. MATLAB scripts 
are available upon request. 

Mechanistic Model 

A detailed description of the model describing relative fitness as a function 
of intracellular β-lactamase activity and ampicillin concentration is provided
in the Extended Experimental Procedures. β-Lactam antibiotics inhibit bacterial
growth through competitive inhibition of DD-carboxypeptidases and other
PBPs involved in synthesis of the peptidoglycan layer of bacterial cell walls.
In the model, relative fitness (Ff) is proportional to the difference in PBP activity
in the presence of mutant versus wild-type TEM-1. PBP activity is described
according to Michaelis-Menten kinetics, modified to include competitive
inhibition by ampicillin. The periplasmic concentration of ampicillin is determined
by the equilibrium between the flux of antibiotic across the outer membrane
and its rate of hydrolysis (inactivation) by β-lactamase (Zimmermann
and Rosselet, 1977). We determined values for the free parameters of our
model (total of 9,998 free parameters for 29,849 data points) by fitting the
experimental data obtained for all 4,997 mutations using a Monte Carlo simulated
annealing (MCSA) procedure; the model fits the data well with an overall
R² = 0.9767.
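
The structure of the model described above can be illustrated with a minimal sketch. This is not the authors' MATLAB implementation: it assumes a single permeability coefficient (`perm`) for flux across the outer membrane, Michaelis-Menten hydrolysis by β-lactamase, and a PBP term with competitive inhibition. All function names, parameter names, and numerical values are hypothetical and chosen only for illustration.

```python
import math


def periplasmic_ampicillin(s_out, vmax, km, perm):
    """Steady-state periplasmic ampicillin concentration (illustrative).

    Solves perm * (s_out - s) = vmax * s / (km + s) for s >= 0, i.e. influx
    across the outer membrane balanced by beta-lactamase hydrolysis.
    """
    # Rearranged to the quadratic perm*s^2 + b*s - perm*s_out*km = 0.
    b = perm * km + vmax - perm * s_out
    disc = b * b + 4.0 * perm * perm * s_out * km
    return (-b + math.sqrt(disc)) / (2.0 * perm)


def pbp_activity(s_peri, ki, substrate=1.0, k_pbp=1.0):
    """Fractional PBP rate under competitive inhibition by ampicillin.

    Michaelis-Menten form v/Vmax = X / (X + K * (1 + I/Ki)), where X is the
    (assumed constant) PBP substrate level and I is periplasmic ampicillin.
    """
    return substrate / (substrate + k_pbp * (1.0 + s_peri / ki))


def activity_difference(s_out, vmax_mut, km_mut, vmax_wt, km_wt, perm, ki):
    """Difference in PBP activity, mutant minus wild type.

    In the paper's model, relative fitness is proportional to this kind of
    difference; the proportionality constant is omitted here.
    """
    a_mut = pbp_activity(periplasmic_ampicillin(s_out, vmax_mut, km_mut, perm), ki)
    a_wt = pbp_activity(periplasmic_ampicillin(s_out, vmax_wt, km_wt, perm), ki)
    return a_mut - a_wt


if __name__ == "__main__":
    # Hypothetical mutant retaining 50% of wild-type Vmax.
    for amp in (39.0, 156.0, 625.0, 2500.0):
        d = activity_difference(amp, 5000.0, 10.0, 10000.0, 10.0, 1.0, 1.0)
        print(f"ampicillin {amp:7.1f}: PBP activity difference {d:+.4f}")
```

With these illustrative values, a mutant that keeps half of the wild-type Vmax changes PBP activity only slightly at the lowest ampicillin dose and much more at the highest, which is the qualitative saturation behavior the text attributes to the fitness-activity relationship.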

SUPPLEMENTAL INFORMATION 

Supplemental Information includes Extended Experimental Procedures, five 
figures, and four tables and can be found with this article online at http://dx. 
SUMMARY

The mechanisms by which neutralizing antibodies 
inhibit Marburg virus (MARV) are not known. We iso- 
lated a panel of neutralizing antibodies from a human 
MARV survivor that bind to MARV glycoprotein (GP) 
and compete for binding to a single major antigenic 
site. Remarkably, several of the antibodies also 
bind to Ebola virus (EBOV) GP. Single-particle EM 
structures of antibody-GP complexes reveal that all 
of the neutralizing antibodies bind to MARV GP at 
or near the predicted region of the receptor-binding 
site. The presence of the glycan cap or mucin-like 
domain blocks binding of neutralizing antibodies to 
EBOV GP, but not to MARV GP. The data suggest 
that MARV-neutralizing antibodies inhibit virus by 
binding to infectious virions at the exposed MARV re- 
ceptor-binding site, revealing a mechanism of filovi- 
rus inhibition. 

INTRODUCTION 

Marburg virus (MARV) and Ebola virus (EBOV), which are mem- 
bers of the family Filoviridae, infect humans and non-human pri- 
mates, causing a hemorrhagic fever with mortality rates up to 
90% (Brauburger et al., 2012). There have been a dozen out- 
breaks of Marburg virus infection in humans reported to date, 
including the most recent report from Uganda of a 30-year-old 
male health worker who died in September 2014 (WHO, 
2014a). As of January 7, 2015, there have been in excess of 
20,000 confirmed, probable, and suspected cases of Ebola virus 
disease (EVD) in the current EBOV outbreak in nine affected 
countries (Guinea, Liberia, Mali, Nigeria, Senegal, Sierra Leone, 




Spain, the United Kingdom, and the United States of America), 
with more than 8,000 deaths (WHO, 2014b). 

There is no licensed treatment or vaccine for filovirus infection. 
Recently, several studies showed that filovirus glycoprotein 
(GP)-specific neutralizing antibodies (nAbs) can reduce mortality 
following experimental inoculation of animals with a lethal dose 
of EBOV (Dye et al., 2012; Marzi et al., 2012; Olinger et al., 
2012; Qiu et al., 2012, 2014; Pettitt et al., 2013) or MARV (Dye 
et al., 2012). The primary target of these nAbs, the filovirus sur- 
face GP, is a trimer composed of three heavily glycosylated 
GP1-GP2 heterodimers (Figure S1). The GP1 subunit can be
divided further into base, head, glycan cap, and mucin-like do- 
mains (Lee et al., 2008). During viral entry, the mucin-like domain 
and glycan cap mediate binding to multiple host attachment fac- 
tors present on the cell membrane. After the virus enters the host 
cell by macropinocytosis (Nanbo et al., 2010; Saeed et al., 2010), 
the GP is cleaved by host proteases that remove approximately 
80% of the mass of the GP1 subunit, including the mucin-like 
domain and glycan cap (Chandran et al., 2005; Dube et al., 
2009). After cleavage of GP in the endosome, the receptor-bind- 
ing sites on GP become exposed, and the GP1 head then is able 
to bind to its receptor, Niemann-Pick C1 (NPC1) protein (Carette
et al., 2011; Chandran et al., 2005; Cote et al., 2011). Subsequent
conformational changes in GP facilitate fusion between viral and 
endosomal membranes. 

The dense clustering of glycans on the glycan cap and mucin-like
domain likely shields much of the surface of EBOV GP from
humoral immune surveillance, leaving only a few sites on the 
EBOV GP protein at which nAbs could bind without interference 
by glycans (Cook and Lee, 2013). Most of our knowledge about 
humoral response against filovirus infections has come from 
studies of murine Abs that recognize EBOV GP. From those 
studies, we learned that mouse nAbs preferentially target pep- 
tides exposed in upper, heavily glycosylated domains or lower 
areas (the GP1 base), where rearrangements occur that drive 
fusion of viral and host membranes (Saphire, 2013). Abs have not
been identified that target protein features of the GP1 head sub- 
domain, where the receptor-binding site to NPC1 protein is 
located. Ab KZ52, the only reported human EBOV GP-specific 
mAb, was obtained from a phage display library that was con- 
structed from bone marrow RNA obtained from a survivor 
(Maruyama et al., 1999). KZ52 binds a site at the base of the 
GP and neutralizes EBOV, most likely by inhibiting the conforma- 
tional changes required for fusion of viral and endosomal mem- 
branes (Lee et al., 2008). Some murine Abs also have been 
reported to bind to the base region of Ebola virus GPs (Dias 
et al., 2011; Murin et al., 2014). In contrast, very little is known
about the mechanisms by which Abs neutralize MARV. Two mu- 
rine Abs that bound the mucin-like domain of MARV GP reduced 
MARV budding from infected cells in culture but failed to 
neutralize virus directly (Kajihara et al., 2012). Polyclonal 
MARV-specific Abs were shown to protect non-human primates 
when administered passively after challenge (Dye et al., 2012).
The epitopes recognized by such polyclonal nAbs, and the 
mechanism of neutralization by which these Abs act, are un- 
known. In this study, we isolated a large panel of human nAbs 
from B cells of a human survivor of severe MARV infection and 
used these Abs to define the molecular basis of MARV neutrali- 
zation by human Abs. The results show that MARV nAbs recog- 
nize the NPC1 receptor-binding domain of MARV GP and, in 
some cases, also recognize conserved structural features in 
the equivalent receptor-binding domain on EBOV GP. 

RESULTS 

Isolation of Monoclonal Antibodies 

We tested plasma of a MARV survivor previously infected in 
Uganda for the 50% neutralization activity against the Uganda 
strain of MARV and found a serum-neutralizing titer of 1:1,010. 
To generate human hybridoma cell lines secreting mAbs to 
MARV, we screened supernatants from EBV-transformed B 
cell lines derived from the survivor for binding to several recom- 
binant forms of MARV GP or to irradiated cell lysates prepared 
from MARV-infected cell cultures. We fused transformed cells
from B cell lines producing Abs reactive to the MARV
antigens with myeloma cells and generated 51 cloned hybridomas
secreting MARV-specific human mAbs. Thirty-nine of
these mAbs were specific to the MARV GP, while 12 bound to infected-cell
lysate, but not to GP; these latter mAbs were shown
in secondary screens to bind to MARV internal proteins (NP, 
VP35, or VP40; data not shown). Analysis of the Ab heavy- and 
light-chain variable domain sequences revealed that all MARV- 
specific mAbs were encoded by unique Ab genes. 

Neutralization Activity 

To evaluate the inhibitory activity of the mAbs, we first performed 
in vitro neutralization studies using a chimeric vesicular stomati- 
tis virus with MARV GP from Uganda strain on its surface (vesic- 
ular stomatitis virus/Marburg glycoprotein recombinant VSV/ 
GP-Uganda). Eighteen of the 39 MARV GP-specific mAbs ex- 
hibited neutralization activity against VSV/GP-Uganda (Figures 
"A and 1C; Figures S2 and S4). Of those 18 nAbs, 9 displayed 
strong (ICso < 10 pg/ml), 8 nAbs displayed moderate (IC 50 : IQ- 



99 pg/ml), and one displayed weak (IC 50 : 100-1,000 ng/ml) 
neutralizing activity against VSV/GP-Uganda. We also tested 
the neutralization potency of all nAbs that bound to MARV GP 
in a plaque reduction assay using live MARV-Uganda virus. Of 
18 Abs that neutralized VSV/GP-Uganda, 11 Abs exhibited 
neutralizing activity against MARV-Uganda (Figures 1A and 10; 
Figures S3 and S4). These data suggest that VSV/GP, often 
used to study neutralizing potency of Abs because of its BSL-2 
containment level, is more susceptible to Ab-mediated neutrali- 
zation than live MARV. This difference is likely explained by the 
significantly lower copy number of MARV GP molecules that 
incorporate into VSV particles compared with the large number 
of GP molecules on the surface of filovirus filaments (Beniac 
et al., 2012; Thomas et al., 1985). Comparison of MARV-neutralizing
and non-neutralizing antibodies at concentrations up to
1.6 mg/ml revealed dose-dependent activity of those mAbs 
that neutralized. The neutralization activity of nAbs was not 
enhanced by the presence of complement (data not shown). 
As expected, we did not detect neutralizing activity for any of 
the 12 Abs specific to MARV NP, VP35, or VP40 proteins. 
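
The potency categories used above can be written as a small helper. This is only a sketch: the function name is hypothetical, and the handling of boundary values (for example, an IC50 between 99 and 100 μg/ml, which the text does not assign) is an assumption.

```python
def neutralization_class(ic50_ug_per_ml):
    """Bin an IC50 (in ug/ml) into the potency categories given in the text.

    Assumed boundary handling: values from 10 up to (but not including) 100
    are 'moderate'; values from 100 through 1,000 are 'weak'; anything above
    1,000 is treated as non-neutralizing.
    """
    if ic50_ug_per_ml < 10.0:
        return "strong"
    if ic50_ug_per_ml < 100.0:
        return "moderate"
    if ic50_ug_per_ml <= 1000.0:
        return "weak"
    return "none"


if __name__ == "__main__":
    # Hypothetical IC50 values, not taken from the paper.
    for ic50 in (2.5, 40.0, 350.0, 5000.0):
        print(ic50, neutralization_class(ic50))
```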

Recognition of Varying Forms of GP 

To characterize the binding of isolated Abs to recombinant 
MARV GPs, we performed binding assays using either a recom- 
binant MARV GP ectodomain containing the mucin-like domain 
(MARV GP) or a recombinant GP lacking residues 257-425 of the 
mucin-like domain (MARV GPΔmuc). Based on OD405 values at
the highest Ab concentration tested (Emax) and 50% effective
concentration (EC50), we divided the MARV-GP-specific Abs
into four major groups, based on binding phenotype (designated
binding groups 1, 2, 3A, and 3B; Figures 1B and S5). Binding
group 1 mAbs had an Emax to GP <2 (i.e., these mAbs never exhibited
a maximal binding level to MARV GP); binding group 2
mAbs had an Emax to GP >2, with EC50 for GP < EC50 for GPΔmuc
(i.e., these mAbs bound to the mucin-like domain or glycan cap);
and binding group 3 had an Emax to GP >2, with EC50 for
GP ≈ EC50 for GPΔmuc (i.e., these mAbs bound equally well to
full-length and mucin-deleted forms of GP), with the group 3A
mAbs having an EC50 for GP <0.5 μg/ml and the group 3B
mAbs having an EC50 for GP >0.5 μg/ml (suggesting that, as a
class, the group 3B mAbs possess a lower steady-state Kq of 
binding to GP than did group 3A mAbs). 
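
The grouping rules above can be sketched as a simple decision function. The text does not state how "approximately equal" EC50 values were judged, so the two-fold tolerance (`fold`) used here is an assumption, as are the function name, argument names, and the "unclassified" fallback.

```python
def binding_group(emax_gp, ec50_gp, ec50_gp_dmuc,
                  emax_cutoff=2.0, ec50_cutoff=0.5, fold=2.0):
    """Assign a mAb to binding group 1, 2, 3A, or 3B (illustrative only).

    emax_gp      -- OD405 at the highest Ab concentration, full-length GP
    ec50_gp      -- EC50 (ug/ml) against full-length GP
    ec50_gp_dmuc -- EC50 (ug/ml) against mucin-deleted GP
    fold         -- assumed tolerance for calling the two EC50s 'equal'
    """
    if emax_gp < emax_cutoff:
        return "1"
    ratio = ec50_gp / ec50_gp_dmuc
    if ratio < 1.0 / fold:
        # Clearly better binding to full-length GP than to GP lacking
        # the mucin-like domain.
        return "2"
    if ratio <= fold:
        # Comparable binding to both forms; split on the 0.5 ug/ml cutoff.
        return "3A" if ec50_gp < ec50_cutoff else "3B"
    # Pattern not described in the text.
    return "unclassified"


if __name__ == "__main__":
    # Hypothetical measurements, not taken from the paper.
    print(binding_group(1.5, 0.1, 0.1))
    print(binding_group(3.0, 0.05, 1.0))
    print(binding_group(3.0, 0.2, 0.25))
    print(binding_group(3.0, 1.0, 1.2))
```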

Abs that lacked neutralization activity against VSV/GP- 
Uganda or MARV-Uganda fell principally into binding groups 1, 
2, and 3A. Interestingly, all VSV/GP-Uganda nAbs displayed a 
unique binding pattern and segregated into binding group 3B 
(Figure 1C). Notably, while mAbs from both groups
3A and 3B bound equally well to the full-length MARV GP and to
the GPΔmuc, EC50 values for nAbs from binding group 3B were
higher than those for non-neutralizing Abs from group 3A.

Competition-Binding Studies 

To determine whether mAbs from distinct binding groups
targeted different antigenic regions on the MARV GP surface, we
performed a competition-binding assay using a real-time
biosensor. We tested 18 MARV nAbs from binding group 3B, 4
Abs from binding group 3A, and 1 Ab from binding group 2 in a
tandem blocking assay in which biotinylated GPΔmuc was

Figure 1. MARV-Neutralizing mAbs Display a Unique Binding Pattern and Target a Distinct Antigenic Region on the GP Surface

(A) Neutralization activity of MR77 (non-neutralizing antibody) or MR213 (neutralizing antibody) against VSV/GP-Uganda (red circles) or MARV-Uganda (black
circles). Error bars represent the SE of the experiment performed in triplicate.

(B) Binding of representative mAbs from four distinct binding groups to the MARV GP (blue squares) or MARV GPΔmuc (green squares). A dotted line indicates the
0.5 μg/ml threshold for categorizing group 3 antibodies as possessing low (3A) or high (3B) EC50 values.

(C) Heatmap showing the neutralization potency of MARV GP-specific mAbs against VSV/GP-Uganda or MARV-Uganda. The IC50 value for each virus-mAb
combination is shown, with dark red, orange, yellow, or white shading indicating high, intermediate, low, or no potency, respectively. IC50 values greater than
1,000 μg/ml are indicated by >. Neutralization assays were performed in triplicate.

(D) Data from competition binding assays using mAbs from binding groups 2, 3A, or 3B. Numbers indicate the percent binding of the competing mAb in the
presence of the first mAb, compared to binding of competing mAb alone. MAbs were judged to compete for the same site if maximum binding of the competing
mAb was reduced to <30% of its un-competed binding (black boxes with white numbers). MAbs were considered non-competing if maximum binding of the
competing mAb was >70% of its un-competed binding (white boxes with red numbers). Gray boxes with black numbers indicate an intermediate phenotype
(between 30 and 70% of un-competed binding).

See also Figures S2, S3, S4, and S5. 



attached to a streptavidin biosensor. Abs from group 1 and the
two non-neutralizing Abs from binding group 3B did not bind
to biotinylated GPΔmuc in the competition assay and were
excluded from the analysis. While non-neutralizing Abs from
binding groups 2 and 3A did not prevent binding of the binding
group 3B nAbs to GPΔmuc, all nAbs blocked binding of each
of the other nAbs to the antigen and segregated into a single
competition-binding group (Figure 1D). These data suggested

Figure 2. Neutralizing Antibodies from a Human Survivor of MARV Bind to the Receptor-Binding Site of GP at Two Distinct Angles of
Approach

(A) Representative reference-free 2D class averages of the MARV GPΔmuc:MR Fab complexes.

(B) EM reconstructions of seven Fab fragments of neutralizing antibodies bound to MARV GPΔmuc (side views). All seven antibodies target a similar epitope on
the top of GP.

(C) These antibodies can be subdivided based on their angles of approach: (1) those that bind toward the top and side of GP1 at a shallow angle relative to the
central 3-fold axis (MR72 in red, MR78 in orange, MR201 in yellow, or MR82 in green) and (2) those that bind at a steeper angle toward the top of GP1 (MR191 in
cyan, MR111 in blue, or MR198 in purple).

(D) The crystal structure of EBOV GPΔmuc (GP1 in white and GP2 in dark gray) is modeled into the MARV GP density (mesh), and the angles of approach of the
neutralizing antibodies are indicated with arrows, colored as in (B). The footprint of the antibodies is indicated by a black circle targeting residues in the putative
receptor-binding site (RBS) through a variety of approach angles.

See also Figure S1.



that all of the nAbs target a single major antigenic region on the 
MARV GP surface. 
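The competition calls summarized in Figure 1D follow simple thresholds on percent binding (<30% competing, >70% non-competing, otherwise intermediate). A minimal sketch of that scoring is given below; the function names are hypothetical, and how values of exactly 30% or 70% are handled is an assumption, since the legend only describes the ranges.

```python
def percent_binding(signal_after_first_mab, signal_alone):
    """Percent binding of the competing mAb relative to its binding alone.

    Both arguments are maximal biosensor signals in the same units.
    """
    return 100.0 * signal_after_first_mab / signal_alone


def classify_competition(percent):
    """Apply the <30% / >70% thresholds from the Figure 1D legend."""
    if percent < 30:
        return "competing"
    if percent > 70:
        return "non-competing"
    # 30-70% inclusive is treated as intermediate (boundary handling assumed).
    return "intermediate"
```

For example, a competing mAb that reaches half of its un-competed signal would be scored as intermediate under this rule.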

Electron Microscopy Studies of Antigen-Antibody 
Complexes 

To determine the location of the antigenic region targeted by
MARV nAbs, we performed negative stain single-particle
electron microscopy (EM) studies using complexes of GPΔmuc
with Fab fragments of seven nAbs from Binding Group 3B. The
EM reconstructions clearly showed that Fab fragments for all
seven nAbs bind at the top of the GP in or near the NPC1 protein
receptor-binding site (Figures 2A and 2B). The binding pattern of
these Abs could be divided further into two major groups based
on their relative angle of approach to the GP head domain. MAbs
MR72, MR78, MR201, and MR82 bound toward the top and side
of GP1 at a shallow angle relative to the central 3-fold axis, while
mAbs MR191, MR111, and MR198 bound at a steeper angle
toward the top of GP1 (Figures 2C and 2D). When we compared
IC50 values for nAbs that bound in the two binding poses, we did
not detect a significant difference in neutralization potency
based on the angle of approach (Figure 1C).

Antibody Neutralization Escape Mutant Viruses 

As an additional strategy to determine residues on MARV GP
involved in binding to nAbs, we generated VSV/GP-Uganda
variant viruses that escaped neutralization, and then we
determined the sequence of the GP of those mAb escape viruses.
Vero E6 cells were inoculated with VSV/GP-Uganda in the
presence of MR72 or MR78 nAbs. Two escape mutant viruses were
isolated: virus variant VSV/GP-72.5 contained three missense
mutations in the MARV GP gene (N129S in the putative NPC1
receptor-binding site, S220P in the glycan cap, and P455L in the
mucin-like domain), and virus variant VSV/GP-78.1 possessed
missense mutation C226Y in the glycan cap (Figure 3A).
Consistent with the EM data, six out of seven nAbs tested displayed a
higher level of neutralization activity against the wild-type
VSV/GP-Uganda than against the VSV/GP-72.5 or VSV/GP-78.1 escape
mutant viruses, suggesting these nAbs recognize MARV GP in
a similar fashion (Figure 3B). MAb MR198 exhibited equal
neutralization potency against wild-type VSV/GP-Uganda or
the two escape mutant viruses (Figure 3B). As all nAbs
segregated into one competition group (Figure 1D), bound the
MARV GP at the NPC1 receptor-binding site (Figures 2A-2D),
and displayed a similar profile of neutralization of escape mutant
viruses (Figure 3B), we propose that blocking of MARV GP
binding to NPC1 is the principal mechanism of MARV neutralization
by these naturally occurring human Abs. This model is supported
by the data in the accompanying paper by Hashiguchi et al.
(2015; this issue of Cell) showing that MR78 inhibits binding of
NPC1 domain C to MARV GP.

Cross-Reactive Binding of MARV Antibodies with 
EBOV GP 

It is surprising that human MARV nAbs recognize the putative 
NPC1 protein receptor-binding site on GP, since previous 
studies suggested that the NPC1 protein receptor-binding site 
on EBOV GP may be obscured from Ab binding by the presence 
of the highly glycosylated glycan cap and mucin-like domain 



Figure 3. Generation of Escape Mutants for MARV-Neutralizing Antibodies

(A) VSV-MARV-72.5 (dotted lines) or VSV-MARV-78.1 (dashed line) escape mutations mapped onto
the domain schematic of MARV GP. RBS, receptor binding site; GLC, glycan cap; MUC, mucin-like
domain.

(B) Neutralization activity of antibodies from binding group 3B against wild-type VSV/GP-Uganda
(circles, straight curves), VSV/GP-72.5 (squares, dotted curves), or VSV/GP-78.1 (triangles, dashed
curves) escape mutant viruses.



(Lee et al., 2008). To determine whether the MARV nAbs we isolated
also could bind in a cross-reactive manner to the EBOV GP
receptor-binding site, we performed ELISA using three recombinant
forms of MARV and EBOV GPs: full-length GP ectodomain containing
the glycan cap and mucin-like domain (designated MARV or EBOV GP),
ectodomains lacking residues 257-425 (MARV) or 314-462 (EBOV) of
the mucin-like domain (designated MARV or EBOV GPΔmuc), and
cleaved GP ectodomains enzymatically treated to remove the
mucin-like domain and glycan cap (designated MARV or EBOV GPcl).
Three of the MARV nAbs, designated MR78, MR111, and MR191,
recognized the EBOV GPcl that lacked
the glycan cap and mucin-like domain (Figure 4A). Remarkably,
the MARV nAb MR72 bound all three forms of both EBOV and
MARV GPs with similar EC50 and Emax values, indicating that
its epitope, and the EBOV receptor-binding site, which it likely
overlaps, might be partially accessible for Ab binding even in
the full-length form (Figure 4A). We tested the breadth of
neutralization of MARV nAbs for filoviruses using a panel of different
MARV and EBOV isolates. While multiple MARV Abs displayed
neutralizing activity toward different MARV strains, MARV nAbs
did not exhibit detectable neutralization activity against EBOV
or VSV/EBOV (Figure 4B). Structural analysis of MARV and
EBOV GP in the accompanying paper by Hashiguchi et al.
(2015) reveals that the glycan cap and mucin-like domain likely
obscure the receptor-binding domain in EBOV, but not in MARV.

In Vivo Testing 

We tested the in vivo protective activity of the mAbs in a murine
model using mouse-adapted MARV strain Ci67 (Warfield et al.,
2007, 2009). Inoculation of mice with MARV Ci67 causes clinical
disease that is lethal in a proportion of animals, although
lethality in mice is typically less than 100% (Warren
et al., 2014). We selected four of the mAbs among those with
the lowest in vitro neutralization IC50 values: MR72, MR82,
MR213, and MR232. The IC50 values in neutralization assays
with MARV Uganda or mouse-adapted MARV strain Ci67 were
comparable (within 2-fold). Seven-week-old BALB/c mice were



Cell 160, 893-903, February 26, 2015 ©2015 Elsevier Inc. 897 








A  Binding (μg/mL)

         MARV                         EBOV
mAb      GP      GPΔmuc   GPcl        GP      GPΔmuc   GPcl
MR65     8.3     7.5      5.0         >       >        >
MR72     3.0     4.7      0.8         6.1     2.1      <0.1
MR78     1.4     2.3      1.1         >       >        107.4
MR82     1.0     1.5      0.5         >       >        >
MR103    8.8     [?]      4.8         >       >        >
MR111    2.5     4.3      1.5         >       >        21.5
MR144    8.1     8.0      3.3         >       >        >
MR186    1.3     0.9      0.5         >       >        >
MR191    2.5     5.1      1.4         >       >        <0.1
MR198    1.4     1.4      0.8         >       >        >
MR201    1.5     1.9      0.5         >       >        >
MR208    5.6     7.3      2.8         >       >        >
MR209    4.0     5.4      2.0         >       >        >
MR213    2.8     3.6      1.1         >       >        >
MR229    1.8     2.9      1.2         >       >        >
MR232    2.0     1.3      0.5         >       >        >
MR238    6.8     [?]      4.9         >       >        >
MR241    2.2     4.0      1.2         >       >        >

B  Neutralization (μg/mL)

         MARV                                                    EBOV
mAb      VSV/GP-   VSV/GP-   MARV-     MARV-     MARV-     MARV-     VSV/GP-   EBOV
         Musoke    Uganda    Musoke    Uganda    Angola    Ravn      EBOV
MR65     31.0      224       >         >         214       >         >         >
MR72     3.6       13.4      >         601       >         368       >         >
MR78     3.8       4.5       >         93        >         286       >         >
MR82     1.8       7.4       234       288       184       185       >         >
MR103    16.5      [?]       >         291       >         >         >         >
MR111    12.2      7.9       370       414       >         444       >         >
MR144    43.1      37.3      900       >         >         354       >         >
MR186    1.5       1.5       24        >         97        64        >         >
MR191    5.5       6.2       441       >         413       >         >         >
MR198    2.7       [?]       290       206       128       30        >         >
MR201    6.6       8.0       343       572       358       832       >         >
MR208    [?]       54.9      896       >         >         106       >         >
MR209    4.2       12.2      577       402       >         93        >         >
MR213    7.6       9.7       >         305       207       121       >         >
MR229    5.1       7.3       103       215       110       59        >         >
MR232    3.9       4.0       >         114       103       127       >         >
MR238    [?]       [?]       264       >         416       >         >         >
MR241    2.7       [?]       376       >         162       >         >         >

Figure 4. Breadth of Binding or Neutralization of Human MARV-Specific mAbs for Diverse Filoviruses

(A) A heatmap showing the binding in ELISA of neutralizing mAbs from binding group 3B to the MARV and EBOV GPs. The EC50 value for each antigen-mAb
combination is shown, with dark red shading indicating lower EC50 values and orange or yellow shading indicating intermediate or higher EC50 values. EC50 values
greater than 1,000 μg/ml are indicated by >.

(B) A heatmap showing the neutralization breadth of mAbs from binding group 3B. The IC50 value for each virus-mAb combination is shown, with dark red
shading indicating increased potency and orange or yellow shading indicating intermediate or low potency. IC50 values greater than 1,000 μg/ml are indicated
by >. Neutralization assays were performed in triplicate.



injected with 100 μg of antibody by the IP route and challenged
with 1,000 plaque-forming units (PFU) of Ci67. Twenty-four hours
later, antibody treatment was repeated. By day 6, all five control
(untreated) mice developed progressive loss of weight and
symptoms of the disease, including dyspnea, recumbency,
and unresponsiveness, and on days 8 and 9, two animals were
found dead and one animal was found moribund and euthanized.
The remaining two animals demonstrated recovery by day 11. In
contrast, all animals treated with any antibody survived and did
not display the elevation of the disease score, with the exception
of two animals treated with MR72, which showed a transient
marginal loss of weight and increase of the disease score on
days 6-9, which did not exceed 1 (Figure 5). The observed level
of protection was remarkable given the relatively modest
in vitro neutralizing potency of the antibodies.

DISCUSSION 

There is an obvious urgent need for prophylactic and therapeutic 
interventions for filovirus infections given the recurrence of 
MARV outbreaks, including that in Uganda in October 2014 
and a massive outbreak of EBOV infections in West Africa 
in 2014. There is very little information about the structural 
determinants of neutralization on which to base the rational 
selection of antibodies, and for MARV there have been no re- 
ported human nAbs. 

This study reveals that naturally occurring human MARV nAbs
isolated from the B cells of a recovered donor principally target
the MARV NPC1 protein receptor-binding site, suggesting that
a major mechanism of MARV neutralization could be inhibition
of binding to receptor. Remarkably, some of the isolated
antibodies also bound to the EBOV GP. This mechanism of MARV
neutralization was unexpected, because previous studies with
EBOV showed that the putative receptor-binding domain on
GP is obscured on the surface of virions by the presence of the
glycan cap and mucin-like domain, only becoming exposed
following cleavage by cathepsin in the endosome. These studies
suggest that the configuration of the MARV GP differs
significantly from that of EBOV GP because the receptor-binding
domain must be accessible for immune recognition on MARV
GP. Indeed, determination of the structure of the MARV GP
and structural analysis of the interaction of mAb MR78 with
MARV and EBOV GP molecules shows this to be the case (see
Hashiguchi et al., 2015).

The information obtained from these studies can be used to
inform development of new therapeutics and structure-based
vaccine designs against filoviruses. Furthermore, as these
nAbs are fully human and exhibit inhibitory activity, they might
be useful as a component of a prophylactic or therapeutic
approach for filovirus infection and disease. The challenge
studies using a murine model here show clear evidence of in vivo
activity and suggest that additional preclinical studies in other
species, such as guinea pigs and macaques, are warranted. The
ability of these nAbs to bind a broad range of MARV isolates
indicates that they may offer detection of or efficacy against new
viral strains yet to emerge. Although some of these mAbs bind to
certain forms of EBOV GP, these antibodies are not likely to be
effective against natural Ebola infection because the EBOV
receptor-binding site is obscured on the viral surface. However, such
mAbs might neutralize EBOV if they could be delivered to the
endosome, where the EBOV receptor-binding site is exposed
following GP cleavage.

Figure 5. Survival and Clinical Overview of Mice Treated with MARV mAbs

(A-C) Groups of mice at five animals per group were injected with individual mAbs by the intraperitoneal route twice: 1 hr prior and 24 hr after MARV challenge at
100 μg per treatment. Untreated animals served as controls. (A) Kaplan-Meier survival curves. (B) Body weight. (C) Illness score.



EXPERIMENTAL PROCEDURES 
Donor 

The donor was an otherwise healthy adult woman who contracted Marburg 
virus (MARV) infection in 2008 following exposure to fruit bats in the Python 
Cave in Queen Elizabeth National Park, Uganda. The donor’s clinical course 
was documented previously (CDC, 2009). Peripheral blood from the donor
was obtained in 2012, four years after the illness, following informed consent. 
The study was approved by the Vanderbilt University Institutional Review 
Board. 

Viruses 

MARV strain 200702854 Uganda (MARV-Uganda) was isolated originally from 
a subject designated “patient A” during the outbreak in Uganda in 2007 (CDC, 
2009; Towner et al., 2009) and underwent four passages in Vero E6 cells. 
MARV strain Musoke (MARV-Musoke) was isolated during the outbreak in 
Kenya in 1980 (Smith et al., 1982) and passaged five times in Vero E6 cells. 
MARV strain 200501379 Angola (MARV-Angola) was isolated during the 
outbreak in Angola in 2005 (Towner et al., 2006) and passaged three times 
in Vero E6 cells. MARV Ravn virus (Ravn) was isolated from a patient in 1987 
in Kenya (Johnson et al., 1996) and passaged four times in Vero E6 cells. All 
strains of MARV were obtained originally from the Special Pathogens Branch, 
U.S. Centers for Disease Control (CDC), and deposited at the World Reference 
Center of Emerging Viruses and Arboviruses (WRCEVA) housed at UTMB. The 
recombinant Ebola Zaire strain Mayinga (EBOV) expressing eGFP was gener- 
ated in our laboratory by reverse genetics (Lubaki et al., 2013; Towner et al., 
2005) from plasmids provided by the Special Pathogens Branch at CDC and 
passaged three times in Vero E6 cells. For analysis of antibody binding by
ELISA, viruses were gamma-irradiated with a dose of 5 × 10^6 rad. The
recombinant VSVs in which the VSV G protein was replaced with the GP of
MARV strain Musoke (VSV/GP-Musoke) or EBOV strain Mayinga (Garbutt
et al., 2004) were provided by Dr. Thomas Geisbert (UTMB) and Dr. Heinz
Feldmann (NIH), respectively; a similar virus with GP from MARV (strain
200702854 Uganda) was constructed as described below. All work with
EBOV and MARV was performed within the Galveston National Laboratory
BSL-4 laboratories.

We used a mouse-adapted strain of MARV for testing the effect of mAbs 
in vivo. The mouse-adapted Ci67 strain of Marburg virus (Warfield et al., 
2007) was provided by Dr. Sina Bavari (U.S. Army Medical Research Institute 
of Infectious Diseases) and amplified by a single passage in Vero-E6 cells. 

Generation of a Chimeric Strain of VSV in which VSV G Protein Was 
Replaced with the GP Protein of MARV Strain Uganda 

The plasmid pVSV-XN2 carrying cDNA of the full-length VSV anti-genome 
sequence and the support plasmids pBS-N, pBS-L, and pBS-P encoding 
the internal VSV proteins under control of the T7 promoter were kindly pro- 
vided by Dr. John Rose (Yale University). The plasmid pC-T7, encoding the 
T7 polymerase, was kindly provided by Dr. Yoshihiro Kawaoka (University of 
Wisconsin). For generation of the VSV/GP-Uganda construct, Vero E6 cell 
monolayers were inoculated with MARV strain 200702854, and total cellular 
RNA was isolated and reverse transcribed. MARV GP open reading frame
(ORF) was PCR amplified from cDNA using forward primer
5'-CATGTACGACGCGTCAACATGAGGACTA-3' and reverse primer
5'-TCTAGCAGCTCGAGCTATCCAATATATTTAGTAAAGATACGACAA-3' (MluI and XhoI
endonuclease sites are underlined, respectively; the start and end of MARV GP ORF
direct and complementary sequences are italicized, respectively). To replace
VSV G with MARV GP, the resulting PCR product was cloned into pVSV-XN2
using the unique MluI and XhoI endonuclease sites located between
the VSV G gene-start and gene-end signals and flanking its ORF, resulting in
the plasmid pVSV/GP-Uganda. To recover the recombinant virus, 1 × 10^6
BSR-T7 cells, kindly provided by Dr. Ursula Buchholz (U.S. National Institute
of Allergy and Infectious Diseases), were transfected with the following
plasmids: pVSV/GP-Uganda, 5 μg, pBS-N, 1.5 μg, pBS-P, 2.5 μg, pBS-L, 1 μg,
and pC-T7, 5 μg. After 48 hr, transfected BSR-T7 cells were collected with a
cell scraper and transferred, along with the supernates, to Vero E6 cell
monolayers for amplification of the recovered VSV/GP-Uganda.
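Because the underlining and italics that marked the restriction sites in the primers do not survive in plain text, the sites can be located programmatically. The sketch below is illustrative only: it uses the standard recognition sequences for MluI (ACGCGT) and XhoI (CTCGAG) and the two cloning primers quoted above, while the helper names are hypothetical.

```python
MLUI_SITE = "ACGCGT"   # MluI recognition sequence
XHOI_SITE = "CTCGAG"   # XhoI recognition sequence

# Cloning primers as given in the text (5' to 3').
FORWARD_PRIMER = "CATGTACGACGCGTCAACATGAGGACTA"
REVERSE_PRIMER = "TCTAGCAGCTCGAGCTATCCAATATATTTAGTAAAGATACGACAA"


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq))


def split_at_site(primer, site):
    """Split a primer into (5' overhang, restriction site, template-matching 3' part)."""
    start = primer.find(site)
    if start < 0:
        raise ValueError("site %s not found in primer" % site)
    return primer[:start], site, primer[start + len(site):]


fwd_overhang, fwd_site, fwd_template = split_at_site(FORWARD_PRIMER, MLUI_SITE)
rev_overhang, rev_site, rev_template = split_at_site(REVERSE_PRIMER, XHOI_SITE)

# The reverse primer anneals to the coding strand, so its template-matching
# part is reverse-complemented to read it in the sense orientation.
rev_template_sense = reverse_complement(rev_template)
```

Read this way, the forward primer's 3' part contains an ATG and the sense-strand reading of the reverse primer's 3' part ends in TAG, which is consistent with the primers spanning the start and end of the GP ORF as the text describes.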

Generation of Human Hybridomas Secreting Monoclonal Antibodies

Peripheral blood mononuclear cells (PBMCs) from the donor were isolated
with Ficoll-Histopaque by density gradient centrifugation. The cells were
cryopreserved immediately and stored in the vapor phase of liquid nitrogen until
use. Previously cryopreserved samples were thawed, and ten million PBMCs
were plated into 384-well plates (Nunc #164688) using 17 ml of cell culture
medium (ClonaCell-HY Medium A, StemCell Technologies, #03801), 8 μg/ml
of the TLR agonist CpG (phosphorothioate-modified oligodeoxynucleotide
ZOEZOEZZZZZOEEZOEZZZT, Invitrogen), 3 μg/ml of the Chk2 inhibitor
(Sigma #C3742), 1 μg/ml of cyclosporine A (Sigma #C1832), and 4.5 ml of
clarified supernate from cultures of B95.8 cells (ATCC VR-1492) containing
Epstein-Barr virus (EBV). After 7 days, cells from each 384-well culture plate
were expanded into four 96-well culture plates (Falcon #353072) using cell
culture medium containing 8 μg/ml CpG, 3 μg/ml Chk2i, and ten million irradiated
heterologous human PBMCs (Nashville Red Cross) and incubated for an
additional 4 days. Plates were screened for MARV antigen-specific
antibody-secreting cell lines using ELISAs. Cells from wells with supernates reacting
in a MARV antigen ELISA were fused with HMMA2.5 myeloma cells using an
established electrofusion technique (Yu et al., 2008). After fusion, hybridomas
were resuspended in medium containing 100 μM hypoxanthine, 0.4 μM
aminopterin, 16 μM thymidine (HAT Media Supplement, Sigma #H0262),
and 7 μg/ml ouabain (Sigma #03125) and incubated for 18 days before
screening hybridomas for antibody production by ELISA.

Human mAb and Fab Production and Purification 

After fusion with HMMA2.5 myeloma cells, hybridomas producing
MARV-specific antibodies were cloned biologically by two rounds of limiting dilution and
by single-cell fluorescence-activated cell sorting. After cloning, hybridomas
were expanded in post-fusion medium (ClonaCell-HY Medium E, STEMCELL
Technologies #03805) until 50% confluent in 75-cm² flasks (Corning #430641).
For antibody production, cells from one 75-cm² flask were collected with a cell
scraper and expanded to four 225-cm² flasks (Corning #431082) in serum-free
medium (Hybridoma-SFM, GIBCO #12045-076). After 21 days, supernates
were clarified by centrifugation and sterile filtered using 0.2-μm pore size filter
devices. HiTrap Protein G or HiTrap MabSelectSure columns (GE Healthcare
Life Sciences #17040501 and #11003494, respectively) were used to purify
antibodies from filtered supernates. Fab fragments were generated by
papain digestion (Pierce Fab Preparation Kit, Thermo Scientific #44985) and
purified by chromatography using a two-column system in which the first
column contained protein G resin (GE Healthcare Life Sciences #29048581)
and the second column contained either anti-kappa or anti-lambda antibody
light chain resins (GE Healthcare Life Sciences #17545811 and #17548211,
respectively).

Expression and Purification of MARV and EBOV GPs 

Angola strain MARV GP ectodomains, containing the mucin-like domain
(MARV GP) or lacking residues 257-425 of the mucin-like domain (MARV
GPΔmuc), were used to screen supernates of transformed B cells and human
hybridomas separately. Recombinant proteins for Ravn strain cleaved GP,
EBOV Mayinga strain GP, EBOV Mayinga strain GPΔmuc, and EBOV Mayinga
strain cleaved GP were designed and expressed similarly. Large-scale
production of recombinant GP or GPΔmuc was performed by transfection of
Drosophila Schneider 2 (S2) cells with modified pMTpuro vectors, followed
by stable selection of transfected cells with 6 μg/ml puromycin. Secreted GP
ectodomain expression was induced with 0.5 mM CuSO4 for 4 days. Proteins
were engineered with a modified double strep tag at the C terminus
(enterokinase cleavage site followed by a strep tag/linker/strep tag) to facilitate
purification using Strep-Tactin resin (QIAGEN #2-1201). Proteins were purified
further by Superdex 200 size-exclusion chromatography in 10 mM Tris and
150 mM NaCl (pH 7.5) (1× TBS).

Lysates of MARV-Infected Cells 

Lysates were prepared as previously described (Ksiazek et al., 1999). Briefly,
Vero E6 cell monolayers in 850 cm² roller bottles were inoculated with
approximately 10^6 PFU MARV or EBOV and incubated at 37°C until partial destruction
of monolayer occurred (approximately 9-10 days). Cell monolayers were
detached using 3-mm glass beads, and cell suspensions were centrifuged at
16,000 × g for 10 min at 4°C. Supernates were discarded; cell pellets were
resuspended in 10× excess of borate buffer saline (10 mM Na2B4O7 and 150 mM
NaCl [pH 9.0]) and centrifuged at 16,000 × g for 10 min at 4°C. Supernates
were discarded; cell pellets were resuspended in cold 1% Triton X-100 (Fisher
Scientific) in borate buffer saline, vortexed, and gamma-irradiated on dry ice at
5 × 10^6 rad. The lysates were sonicated with a 600 W Tekmar Sonic Disrupter
TM600 (Tekmar) using a cuphorn sonicator at maximum power setting and
50% duty cycle for 10 min and centrifuged at 16,000 × g, and the supernates
were aliquoted.

Screening ELISA 

ELISA plates were coated with lysates of MARV-infected cells (diluted 1:1,000
in Dulbecco's PBS [DPBS]) or recombinant MARV GP or MARV GPΔmuc
proteins (20 μg in 10 ml DPBS per plate) and incubated at 4°C overnight. Plates
were blocked with 100 μl of blocking solution/well for 1 hr. Blocking solution
consisted of 10 g powdered milk, 10 ml of goat serum, 100 ml of 10×
DPBS, and 0.5 ml of Tween-20 mixed to a 1 L final volume with distilled water.
The presence of antibodies bound to the GP was determined using goat
anti-human immunoglobulin G (IgG) horseradish peroxidase-conjugated
secondary antibodies (Southern Biotech #2040-05, 1:4,000 dilution) and 1-Step Ultra
TMB-ELISA substrate (Thermo Scientific #34029), with optical density read at
450 nm after stopping the reaction with 1 M HCl.
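The quantities in the coating and blocking steps translate into final concentrations by simple arithmetic. The short sketch below works these out; it is only a worked calculation from the amounts stated above, with variable and function names chosen for illustration.

```python
FINAL_VOLUME_ML = 1000.0  # blocking solution is made up to 1 L


def percent_of_total(amount, total):
    """Express an amount as a percentage of the total volume."""
    return 100.0 * amount / total


# Blocking solution components, per 1 L final volume.
milk_percent_w_v = percent_of_total(10.0, FINAL_VOLUME_ML)       # 10 g -> % (w/v)
goat_serum_percent_v_v = percent_of_total(10.0, FINAL_VOLUME_ML)  # 10 ml -> % (v/v)
tween_percent_v_v = percent_of_total(0.5, FINAL_VOLUME_ML)        # 0.5 ml -> % (v/v)
dpbs_strength_x = 10.0 * 100.0 / FINAL_VOLUME_ML                  # 100 ml of 10x stock

# Coating: 20 ug of recombinant protein in 10 ml DPBS per plate.
coating_ug_per_ml = 20.0 / 10.0

print("milk %.2f%% (w/v), goat serum %.2f%% (v/v), Tween-20 %.3f%% (v/v), DPBS %.1fx"
      % (milk_percent_w_v, goat_serum_percent_v_v, tween_percent_v_v, dpbs_strength_x))
print("coating concentration: %.1f ug/ml" % coating_ug_per_ml)
```

The coating concentration obtained this way matches the 2 μg/ml used for plate coating in the half-maximal effective concentration analysis.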

Half-Maximal Effective Concentration Binding Analysis 

MARV or EBOV GPs, MARV or EBOV GPΔmuc, or Ravn or EBOV
cathepsin-cleaved GPs were coated onto 384-well plates (Thermo Scientific Nunc
#265203) in DPBS at 2 μg/ml overnight, then antigen was removed, and plates
were blocked with blocking solution made as above. Antibodies were applied
to the plates at a concentration range of 1.5 μg/ml to 270 ng/ml (binding
groups #1, #2, and 3A) and 0.1 μg/ml to 10 ng/ml (binding group #3B) using
3-fold serial dilutions. The presence of antibodies bound to the GP was
determined using goat anti-human IgG alkaline phosphatase conjugate (Meridian
Life Science #W99008A, 1:4,000 dilution) and p-nitrophenol phosphate
substrate tablets (Sigma #S0942), with optical density read at 405 nm after
120 min. A non-linear regression analysis was performed on the resulting
curves using Prism (v. 5) (GraphPad) to calculate EC50 values.
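To make the dilution scheme and the idea of an EC50 concrete, the sketch below generates a fold-dilution series and estimates the half-maximal concentration by log-linear interpolation between the two bracketing points. This is a deliberately simpler stand-in and not the non-linear regression performed in Prism; the function names are hypothetical, the curve is assumed to rise monotonically from a baseline near zero, and the number of dilution steps is an illustrative choice.

```python
import math


def serial_dilution(top_concentration, fold, steps):
    """Return `steps` concentrations, each `fold`-times lower than the last."""
    return [top_concentration / fold ** i for i in range(steps)]


def interpolate_ec50(concentrations, optical_densities):
    """Estimate the concentration giving half of the maximal OD.

    Uses linear interpolation on a log10 concentration axis between the two
    measured points that bracket the half-maximal signal. Returns None if no
    bracketing pair is found.
    """
    pairs = sorted(zip(concentrations, optical_densities))
    half_max = max(optical_densities) / 2.0
    for (c_lo, od_lo), (c_hi, od_hi) in zip(pairs, pairs[1:]):
        if od_hi > od_lo and od_lo <= half_max <= od_hi:
            fraction = (half_max - od_lo) / (od_hi - od_lo)
            log_c = math.log10(c_lo) + fraction * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None


# Four 3-fold steps down from a top concentration of 270 (units as in the assay).
series = serial_dilution(270.0, 3, 4)
```

A full four-parameter fit would also estimate the baseline and slope, so interpolated values are best treated as a rough check on fitted EC50 values rather than a replacement for them.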

MARV and EBOV Neutralization Experiments 

Dilutions of mAbs in triplicate were mixed with 150 PFU of MARV or EBOV
expressing eGFP in MEM containing 10% fetal bovine serum (FBS) (HyClone)
and 50 μg/ml gentamicin (Cellgro #30-005-CR) with or without 5% guinea
pig complement (MP Biomedicals #642836) in a total volume of 0.1 ml
and incubated for 1 hr at 37°C for virus neutralization. Following neutralization,
virus-antibody mixtures were placed on monolayers of Vero E6 cells in 24-well
plates, incubated for 1 hr at 37°C for virus adsorption, and overlayed with MEM
containing 2% FBS and 0.8% methylcellulose (Sigma-Aldrich #M0512). After
incubation for 5 days, medium was removed, cells were fixed with 10%
formalin (Fisher Scientific #245-684), and plates were sealed in plastic bags
and incubated for 24 hr at room temperature. Sealed plates were taken out
of the BSL-4 laboratory according to approved SOPs, and monolayers were
washed three times with PBS. Viral plaques were immunostained with the
serum of rabbits that had been hyperimmunized with MARV, or with a mAb
against EBOV, clone 15H10 (BEI Resources #NR-12184). Alternatively,
following virus adsorption, monolayers were covered with MEM containing
10% FBS and 1.6% tragacanth (Sigma-Aldrich #G1128). After incubation for
14 days, medium was removed, cells were fixed with 10% formalin, and plates
were sealed in plastic bags, incubated for 24 hr at room temperature, and
taken out of the BSL-4 laboratory as above. Fixed monolayers were stained
with 10% formalin containing 0.25% crystal violet (Fisher Scientific #C581-100),
and plaques were counted.

VSV-MARV and VSV-EBOV Neutralization Tests 

Neutralization assays were performed in triplicate, as described above for
MARV and EBOV. Following neutralization, virus-antibody mixtures were
placed on monolayers of Vero E6 cells in duplicate, incubated for 1 hr at
37°C for virus adsorption, and overlayed with MEM containing 2% FBS and
0.9% methylcellulose. After incubation for 3 days, medium was
removed, monolayers were fixed and stained with 10% formalin containing
0.25% crystal violet, and plaques were counted.

Generation and Sequencing of VSV/GP-Uganda Escape Mutants 

Vero E6 cell monolayers with 2-fold dilutions of mAbs (12.5-200 μg/ml) added
to the medium were inoculated with 200 PFU of recombinant VSV/GP-Uganda
and incubated at 37°C for 2-4 days. To determine which samples contained
live virus, supernates were collected, virus was titrated in Vero E6 cell
monolayers under methylcellulose overlay, monolayers were incubated at 37°C for
3-4 days, and plaques were counted. Supernates with the highest
concentrations of mAbs, which were found to contain live virus by plaque titration, were
incubated in presence of serially diluted mAbs, followed by titration of virus as
above. The procedure was performed a total of three times. Escape mutant
viruses harvested after the third passage were cloned biologically by plaque
purification. For biological cloning, Vero E6 cell monolayers in 24-well plates
were inoculated with dilutions of the escape mutant viruses in the presence
of the corresponding mAbs (200 μg/ml of MR72 or 100 μg/ml of MR78) and
covered with 0.7% low melting temperature SeaPlaque agarose (Lonza
#50100). Monolayers were incubated at 37°C for 6 days; plaques were
visualized with 0.01% neutral red aqueous solution (Electron Microscopy Sciences),
picked, resuspended in medium, and transferred to Vero E6 cell monolayers in
24-well plates in the presence of the corresponding mAbs (200 μg/ml of MR72
or 100 μg/ml of MR78) for virus propagation. In 2-5 days, based on the extent
of CPE observed, virus was harvested, and cells were dissolved in Trizol
reagent (Life Technologies #15596018). Total cellular RNA was extracted,
reverse transcribed, and amplified by PCR with the primers described above
for generation of a chimeric strain of VSV. Two overlapping fragments
covering MARV GP ORF were PCR amplified from cDNA using forward primer
5'-CATGTACGACGCGTCAACATGAGGACTA-3' and reverse primer
5'-ACTAAGCCCTGCTGCCAGGT-3' or forward primer
5'-ACAACAATGTACCGAGGCAA-3' and reverse primer
5'-TCTAGCAGCTCGAGCTATCCAATATATTTAGTAAAGATACGACAA-3', and the nucleotide
sequences of the GP ORFs were determined using standard procedures.

Analysis of Growth Kinetics of VSV/GP-Uganda Escape Mutant 
Viruses 

Vero E6 cell monolayers in 24-well plates were inoculated in triplicate with 
VSV/GP-Uganda escape mutants or non-mutated virus at an MOI of 
0.00025 PFU/cell in the presence of varying concentrations of the correspond- 
ing mAbs. Aliquots of medium were collected every 12 hr and frozen for titra- 
tion at a later time. Titration of virus in aliquots was performed as above, 
without adding antibodies to the culture medium. 

Biolayer Interferometry Competition Binding Assay 

Biotinylated GP or GPΔmuc (EZ-link Micro NHS-PEG4-Biotinylation Kit,
Thermo Scientific #21955) (1 µg/ml) was immobilized onto streptavidin-coated
biosensor tips (ForteBio #18-5019) for 2 min. After measuring the baseline
signal in kinetics buffer (KB; 1× PBS, 0.01% BSA, and 0.002% Tween 20)
for 2 min, biosensor tips were immersed into the wells containing primary antibody
at a concentration of 100 µg/ml for 10 min. Biosensors then were
immersed into wells containing competing mAbs at a concentration of
100 µg/ml for 5 min. The percent binding of the competing mAb in the presence
of the first mAb was determined by comparing the maximal signal of 
competing mAb applied after the first mAb complex to the maximal signal of 
competing mAb alone. MAbs were judged to compete for binding to the 
same site if maximum binding of the competing mAb was reduced to <30% 
of its un-competed binding. MAbs were considered non-competing if 
maximum binding of the competing mAb was >70% of its un-competed bind- 
ing. A level of 30%-70% of its un-competed binding was considered interme- 
diate competition. 
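
The competition criteria above amount to a ratio followed by two cut-offs. The following is a minimal sketch of that calculation, not the authors' analysis code: the function names and the example signal values are hypothetical, and only the 30% and 70% thresholds come from the text.

```python
def percent_binding(signal_after_first_mab: float, signal_alone: float) -> float:
    """Percent of un-competed binding retained by the competing mAb.

    signal_after_first_mab: maximal signal of the competing mAb applied
        after the first mAb complex (hypothetical input name).
    signal_alone: maximal signal of the competing mAb alone.
    """
    if signal_alone <= 0:
        raise ValueError("signal of the competing mAb alone must be positive")
    return 100.0 * signal_after_first_mab / signal_alone


def classify_competition(pct: float) -> str:
    """Apply the cut-offs stated in the text: <30%, >70%, otherwise intermediate."""
    if pct < 30.0:
        return "competing"
    if pct > 70.0:
        return "non-competing"
    return "intermediate"


# Hypothetical biosensor readings (arbitrary units), for illustration only.
for after_first, alone in [(0.12, 1.20), (0.60, 1.20), (1.10, 1.20)]:
    pct = percent_binding(after_first, alone)
    print(f"{pct:.0f}% of un-competed binding: {classify_competition(pct)}")
```

Values of exactly 30% or 70% fall into the intermediate class here, which matches the stated 30%-70% range.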

Sequence Analysis of Antibody Variable Region Genes 

Total cellular RNA was extracted from clonal hybridomas that produced MARV
antibodies, and RT-PCR was performed using mixtures of primers designed
to amplify all heavy-chain or light-chain antibody variable regions. The
generated PCR products were purified and cloned into the pJet 1.2 plasmid
vector (Thermo Scientific, #K1231) for sequence analysis. The nucleotide sequences
of plasmid DNAs were determined using an ABI3700 automated
DNA sequencer. Heavy-chain or light-chain antibody variable region sequences
were analyzed using the IMGT/V-Quest program (Brochet et al.,
2008; Giudicelli et al., 2011). The analysis involved the identification of germline
genes that were used for antibody production, location of complementarity-determining
regions (CDRs) and framework regions (FRs), as well as the number
and location of somatic mutations that occurred during affinity maturation.

Statistical Analysis 

EC50 values for neutralization were determined by finding the concentration of
mAb at which a 50% reduction in plaque counts occurred after incubation of 
virus with neutralizing antibody. A logistic curve was fit to the data using the 
count as the outcome and the log-concentration as the predictor variable. 
The results of the model then were transformed back to the concentration 
scale. Results are presented as the concentration at the dilution that achieves 
a 50% reduction from challenge control with accompanying 95% confidence 
intervals. Each antibody was treated as a distinct analysis in a Bayesian non- 
linear regression model. 
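
To make the curve-fitting step concrete, the sketch below fits a logistic curve to plaque counts against log-concentration and transforms the fitted midpoint back to the concentration scale. It is an illustration under stated assumptions, not the analysis used in the study: it uses a coarse least-squares grid search in place of the Bayesian nonlinear regression, it produces no confidence intervals, it fixes the upper plateau at the challenge-control count, and the function names, grid ranges, and plaque counts are all hypothetical.

```python
import math


def expected_count(conc, n0, log10_ec50, slope):
    """Logistic curve in log10-concentration; equals n0 / 2 when conc == EC50."""
    x = math.log10(conc)
    return n0 / (1.0 + math.exp(slope * (x - log10_ec50)))


def fit_ec50(concs, counts, n0):
    """Least-squares grid search over the midpoint and slope (assumed ranges).

    Returns (EC50 on the concentration scale, slope).
    """
    lo = math.log10(min(concs))
    hi = math.log10(max(concs))
    best = None
    for i in range(201):                      # midpoint grid on the log10 scale
        log10_ec50 = lo + (hi - lo) * i / 200
        for j in range(1, 101):               # slope grid: 0.1 .. 10.0
            slope = 0.1 * j
            sse = sum(
                (expected_count(c, n0, log10_ec50, slope) - y) ** 2
                for c, y in zip(concs, counts)
            )
            if best is None or sse < best[0]:
                best = (sse, log10_ec50, slope)
    _, log10_ec50, slope = best
    return 10 ** log10_ec50, slope            # back-transform to concentration


# Hypothetical plaque counts at five mAb concentrations (µg/ml).
concs = [0.1, 1.0, 10.0, 100.0, 1000.0]
counts = [98, 88, 50, 12, 2]
ec50, slope = fit_ec50(concs, counts, n0=100.0)
print(f"estimated EC50 = {ec50:.2f} µg/ml, slope = {slope:.1f}")
```

Fitting on the log-concentration scale keeps the curve symmetric about its midpoint, which is why the midpoint is estimated in log units and exponentiated only at the end.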

Sample Preparation for EM Studies 

A Ravn strain MARV GP mucin-deleted construct (GPΔmuc) was produced by
stable cell line expression in Drosophila S2 cells, as described above. Human
Fab proteins for MARV-specific antibodies were generated as described
above. Fabs were added in molar excess to GPΔmuc and allowed to incubate
overnight at 4°C. Complexes then were purified by Superdex 200 size-exclu- 
sion chromatography in TBS. 

Electron Microscopy and Sample Preparation 

A 4 µl aliquot of each complex that had been diluted to a concentration
of ~0.03 µg/ml with TBS buffer was placed for 15 s onto carbon-coated
400 Cu mesh grids that had been plasma cleaned for 20 s (Gatan), blotted
off on the edge of the grid, and then immediately stained for 30 s with 4 µl of
2% uranyl formate. The stain was blotted off on the edge of the grid, and the
grid was allowed to dry. Data were automatically collected with Leginon (Carragher
et al., 2000; Potter et al., 1999; Suloway et al., 2005) using a FEI Tecnai
F20 electron microscope operating at 120 keV with an electron dose of
30 e⁻/Å² and a magnification of 52,000× that resulted in a pixel size of
2.65 Å at the specimen plane when collected with a Tietz CMOS 4k × 4k CCD
camera. Particle orientations appeared to be generally isotropic, and images
were acquired at a constant defocus value of −1.0 µm at 0° stage tilt.

Image Processing of Protein Complexes 

Particles were picked automatically using DoG Picker (34) and placed into a 
particle stack using the Appion software (Lander et al., 2009). Reference- 
free 2D class averages were generated with the Xmipp clustering 2D alignment 
software (van Heel et al., 1996) and sorted into an initial 300 classes. Non-GP 
particles were removed, and the stack was further subclassified into classes 
with ~100 particles per class in order to generate the final particle stack 
used for the reconstruction. Various numbers of class averages were chosen 
to create initial models using EMAN2 common lines software (Tang et al., 
2007). A model that best matched its projected classes was then used for 
refinement against the raw particle stack, imposing C3 symmetry, and the 
reconstruction was generated with ten rounds of refinement and increasingly 
smaller angular sampling rates with EMAN2 (Tang et al., 2007). All model fitting 
and manipulation was completed using UCSF Chimera (Pettersen et al., 2004). 

In Vivo Testing 

The animal protocol for testing of mAbs in mice was approved by the Institu- 
tional Animal Care and Use Committee of the University of Texas Medical 
Branch at Galveston. Seven-week-old BALB/c mice (Harlan) were placed in 
the ABSL-4 facility of the Galveston National Laboratory. Groups of
five mice were injected with individual mAbs by the intraperitoneal
route twice, 1 hr before and 24 hr after MARV challenge, using 100 µg
per treatment. Untreated animals served as controls. For the challenge,
mice were injected with 1,000 PFU of the mouse-adapted MARV strain Ci67 
by the intraperitoneal route. Animals were weighed and monitored daily over 
the 3-week period after challenge. Once animals were symptomatic, they 
were examined twice per day. The disease was scored using the following
parameters: dyspnea (possible scores 0-5), recumbency (0-9), unresponsive- 
ness (0-5), and bleeding/hemorrhage (0-5); the individual scores for each 
animal were summarized. 

ACCESSION NUMBERS 

EM reconstructions have been deposited in the Electron Microscopy Data 
Bank under the accession codes EMD-6232 through 6238. 
The structures of Marburg virus
glycoprotein in complex with a cross- 
reactive human antibody, as well as of the 
Ebola virus glycoprotein bound to the 
same antibody, reveal that there is a 
conserved epitope among filoviruses that 
overlaps with the putative receptor- 
binding site. These studies provide a map 
by which therapy with cross-reactive 
antibodies and inhibitors of entry could 
be developed. 
SUMMARY

The filoviruses, including Marburg and Ebola, express
a single glycoprotein on their surface, termed GP, 
which is responsible for attachment and entry of 
target cells. Filovirus GPs differ by up to 70% in pro- 
tein sequence, and no antibodies are yet described 
that cross-react among them. Here, we present the 
3.6 Å crystal structure of Marburg virus GP in complex
with a cross-reactive antibody from a human survivor, 
and a lower resolution structure of the antibody bound 
to Ebola virus GP. The antibody, MR78, recognizes 
a GP1 epitope conserved across the filovirus fam- 
ily, which likely represents the binding site of their 
NPC1 receptor. Indeed, MR78 blocks binding of 
the essential NPC1 domain C. These structures and 
additional small-angle X-ray scattering of mucin-con- 
taining MARV and EBOV GPs suggest why such 
antibodies were not previously elicited in studies 
of Ebola virus, and provide critical templates for 
development of immunotherapeutics and inhibitors 
of entry. 

INTRODUCTION 

The filovirus family includes Marburg virus and five ebolaviruses
(Ebola, Sudan, Reston, Bundibugyo, and Taï Forest viruses),
most of which cause highly lethal hemorrhagic fever and multiple
outbreaks among humans. Among the filoviruses, Marburg virus
was the first to be identified when it sickened laboratory workers
in Europe in 1967 (Malherbe and Strickland-Cholmley, 1968; Siegert
et al., 1968). Marburg virus has since re-emerged multiple
times, with modern strains conferring greater lethality (~90%)
(Geisbert et al., 2007; Towner et al., 2006). Sudan virus has
caused at least six outbreaks between 1976 and 2013 (Albariño
et al., 2013; Bowen et al., 1977; Sanchez and Rollin, 2005; Shoemaker
et al., 2012), Bundibugyo virus emerged in 2007 (Towner
et al., 2008; Wamala et al., 2010) and again in 2012 (Albariño
et al., 2013), and Reston virus was found to infect ranches of
swine being raised for human consumption in Asia in 2009 and
2011 (Barrette et al., 2009; Pan et al., 2014; Sayama et al.,
2012). Ebola virus is typically found in Central Africa, but re-emerged
in Western Africa in 2014 to cause an outbreak unprecedented
in magnitude and geographic spread (WHO Ebola
Response Team, 2014). During this outbreak, an experimental
Ebola virus-specific monoclonal antibody (mAb) cocktail (Qiu
et al., 2014) was used compassionately in several patients. No
such treatment yet exists that could be used against Marburg
virus or the other four ebolaviruses.

Filoviruses express a single protein on their envelope sur- 
face, a glycoprotein termed GP, which is responsible for 
attachment to, and entry of, host cells (Sanchez et al., 1996). 
GP forms a trimer on the viral surface. In the trimer, each mono- 
mer is comprised of GP1 and GP2 subunits that are anchored 
together by a GP1-GP2 disulfide bond (Volchkov et al., 1998). 
GP1 contains a receptor-binding core topped by a glycan 
cap and a heavily glycosylated mucin-like domain (Lee et al., 
2008), while GP2 contains two heptad repeats and a transmem- 
brane domain. Filoviruses initially enter cells via macropinocyto- 
sis (Aleksandrowicz et al., 2011; Nanbo et al., 2010; Saeed 
et al., 2010; Mulherkar et al., 2011). Once in the endosome, 
the viral surface GP is cleaved by host cathepsins. Cleavage re- 
moves the mucin-like domains and glycan cap (Chandran et al., 
2005; Schornberg et al., 2006; Hood et al., 2010; Marzi et al., 
2012a; Brecher et al., 2012) and renders GP competent to bind 
the Niemann-Pick C1 (NPC1) receptor (Carette et al., 2011;
Côté et al., 2011). Interestingly, Ebola virus entry requires cleavage
by cathepsin B (Chandran et al., 2005; Martinez et al., 2010;
Schornberg et al., 2006), while Marburg virus entry is indepen- 
dent of cathepsin B (Gnirss et al., 2012; Misasi et al., 2012). 
The reasons underlying these differences are unknown. After 
enzymatic cleavage and receptor binding, the GP2 subunit 
unwinds from its GP1 clamp and rearranges irreversibly into a 
six-helix bundle (Malashkevich et al., 1999; Weissenhorn et al., 
1998a; Weissenhorn et al., 1998b) to drive fusion of virus and 
host membranes. 
Antibody therapies recently have demonstrated effective
post-exposure protection against filoviruses in animal models
(Dye et al., 2012; Marzi et al., 2012b; Olinger et al., 2012; Pettitt 
et al., 2013; Qiu et al., 2012; Qiu et al., 2014). MAbs can be pro- 
duced on large scale and offer more reproducible effects than 
polyclonal sera from survivors. However, most mAbs available 
only recognize Ebola virus. Very few are yet described against 
Marburg virus, and no antibodies are yet described that cross-react
among the filoviruses. Indeed, Marburg and Ebola GP
are 72% different in protein sequence, and the filoviruses are
thought to be antigenically distinct. Further, there is no structure 
available for the unique Marburg virus GP, by which we may 
interpret differences in requirements for viral entry, or develop 
immunotherapeutics or inhibitors of entry. 

Here, we report the crystal structure of the trimeric, receptor- 
competent form of Marburg virus GP in complex with a neutral- 
izing antibody, termed MR78, that was identified in a recent 
human survivor of Marburg virus infection (Flyak et al., 2015). 
Atypically, MR78 cross-reacts to cleaved Ebola virus GP. An 
additional structure of MR78 in complex with Ebola virus GP 
illustrates the basis of the cross-reactivity: the antibody binds 
a hydrophobic “trough” at the top of GP1, the sequence and
structure of which are conserved across the filoviruses. We propose
that this trough is the binding site of the critical domain C of
the NPC1 receptor. Indeed, MR78 blocks binding of domain C 
to Marburg GP. Further, the extended third complementarity- 
determining region of the heavy chain (CDR H3) of MR78 mimics 
the glycan cap that shields this site on Ebola virus prior to entry 
and may mimic the receptor itself. These crystal structures plus 
additional biophysical analysis of complete, mucin-containing 
Ebola and Marburg GP ectodomains reveal that the receptor- 
binding site is masked on the surface of Ebola virus but more 
exposed on the surface of Marburg virus. These findings may 
explain why a cross-reactive antibody such as MR78 has not 
been identified in studies of Ebola virus. 

RESULTS 

Structure Determination 

Trimeric GP ectodomains for Marburg virus (MARV; strain 
Ravn) or Ebola virus (EBOV, also known as Ebola Zaire; 
strain Mayinga) were expressed in Drosophila S2 cells, with 
or without their mucin-like domains (GP and GPΔmuc, respectively).
MARV and EBOV GPΔmuc were further proteolyzed
by trypsin or thermolysin, respectively, to produce cleaved
GP (GPcl) resembling the version of GP competent for receptor
binding in the endosome (Figure S1A). Three hundred versions
of MARV GP were engineered and complexed with 22 different 
mAbs in order to find a crystallizable combination. Hundreds 
of crystals of the final MARV GPcl-MR78 combination were 
grown and screened for X-ray diffraction: just one crystal 
yielded suitable diffraction. 

Diffraction to 3.6 Å resolution was obtained from a single crystal
of the MARV GPcl-Fab MR78 complex. The structure was
determined by molecular replacement using EBOV GP and
Fab KZ52 (Lee et al., 2008) as search models and was refined
to Rwork of 24.7% and Rfree of 27.9% (Table S1). Four GP-Fab
complexes are contained in the asymmetric unit: one complete
trimer and one other monomer, which forms its biologically relevant
trimer around a crystallographic 3-fold axis.

Differences in GP Structure between EBOV and MARV 

Although the overall organization is similar between Marburg and
Ebola GPs (1.8 Å rmsd among 212 Cα atoms) (Figures 1A and
1B), several structural differences exist that may explain their
differing requirements for cellular entry. The first difference is
that the intra-GP1 disulfide bond formed by C121 and C147 in
ebolavirus GP structures (Ebola [Lee et al., 2008] and Sudan
[Bale et al., 2012; Dias et al., 2011]) does not exist in MARV. In
MARV, the two cysteines are replaced instead with L105 and
H131 (Figure 1C and Figure S1B). As a result, the equivalent
polypeptides, which form the crest of the receptor-binding subunit,
differ in structure and flexibility. In the ebolaviruses, the
polypeptide bearing C147 (residues 145 to 150) turns inward, toward
the trimer center to disulfide bond to C121. In MARV, the
equivalent polypeptide (residues 129 to 134) turns outward into
solvent, away from the trimer center.
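
For readers unfamiliar with the rmsd figure quoted for the GP superposition, the sketch below shows the standard root-mean-square deviation over paired atom positions. It is a generic illustration, not the procedure or software used for the comparison in the text: it assumes the two coordinate sets are already superimposed and matched atom for atom, and the function name and toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired (x, y, z) positions in Å.

    Assumes both structures are already superimposed and that the i-th
    entry of each list refers to the equivalent atom (e.g. matched Cα atoms).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Toy coordinates for three matched atoms (hypothetical values).
structure_1 = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
structure_2 = [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0), (7.6, 3.0, 0.0)]
print(f"rmsd = {rmsd(structure_1, structure_2):.2f} Å")
```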

A second difference between MARV and the ebolaviruses lies
at the base of the cathepsin cleavage loop. In MARV, these residues
(172-180) form a clear alpha helix (α2), which packs against
the outside of the GP2 fusion loop, interacting with both the N-
and C-terminal strands of the fusion loop (Figure 1D). In ebolaviruses,
the equivalent residues are predicted to form a loop rather than a
helix and are disordered (Bale et al., 2012; Dias et al., 2011; Lee
et al., 2008). In MARV GP, the peptide connecting this α2 helix to
β14 in the glycan cap would necessarily and immediately cover
both the N- and C-terminal arms of the GP2 fusion loop, and if
uncleaved, would hinder the conformational changes of fusion.
Structural differences in α2 of MARV may prevent effective processing
by cathepsin B.

The third difference in the MARV GP structure lies at the N terminus,
in the base of the β sheet that forms the GP1 spool, about
which the metastable GP2 subunit is wound. In EBOV, the base
of the spool connects to the anchoring GP1-GP2 disulfide bond
by a short stretch of polypeptide that intimately interacts with
GP2. This short connecting polypeptide contains an N-linked
glycan at Asn40, and also contains residue Asp47, which renders
EBOV dependent on cathepsin B for entry (Misasi et al., 2012). In
EBOV entry, cathepsin B removes an additional and critical
1 kDa of mass from GP beyond that removed by cathepsin L,
but the site and consequences of that extra cleavage event are
not yet known. We propose that if cathepsin B cleaves this connecting
loop, EBOV GP2 would be freed from the constraints of
the disulfide bond and better able to undergo the conformational
rearrangements of fusion. Our crystal structure reveals that
MARV, which is cathepsin B-independent, is structured differently
from EBOV at the same site. In MARV, the base of the
GP1 spool is more mobile and is shifted toward the center of
the trimer, inside of the fusion loop. Further, unlike in ebolaviruses
(Dias et al., 2011; Bale et al., 2012), the polypeptide
connection to the MARV GP1-GP2 disulfide could not be visualized
and the N-linked glycan is absent. The nearest glycan is
instead attached to residue 171 on the MARV GP1 β sheet itself
(Figure 1E). These differences in sequence, glycosylation,
mobility, and conformation likely allow MARV to be cleaved by
other enzymes and render MARV cathepsin B-independent.
Figure 1. Structure of Marburg Virus GP

(A) Crystal structure of MARV GPcl (GP1, purple; and GP2, dark gray) superimposed with the equivalent structure of EBOV (PDB ID, 3CSY; GP1, blue; and GP2,
light gray). The glycan cap of EBOV GP is deleted for clarity. The yellow box outlines the MR78 epitope and putative receptor-binding site. The black box outlines
the interaction site of the MARV-specific helix α2 of GP1 (purple) with the fusion loop of GP2 (dark gray). The visible N-linked sugars on MARV and EBOV GPcl
crystal structures are shown as dot models. MARV GPcl bears glycans at positions N94 and N171, which are not glycosylated in EBOV. See also Figure S1.

(B) Top view of GP. 

(C) MARV GP lacks the intra-GP1 disulfide bond of EBOV. C147 of EBOV (blue) is replaced by H131 in MARV (purple), and the corresponding polypeptide traces 
outward from the trimer center. The orange box outlines the glycan attachment sites at the base of each GP. 

(D) Residues 172-180 of MARV form an α helix (α2) that packs against both N- and C-terminal arms of the fusion loop. In ebolaviruses, the equivalent residues are
predicted to form a loop rather than a helix and are disordered in crystal structures.

(E) At the base of GP, MARV bears a glycan attached to N171 while EBOV bears a glycan attached to N40 (drawn as an oval as it was not included in the EBOV
crystal structure).



Overall Organization of the MARV or EBOV GPcl Bound
to Fab MR78 

The crystal structure of MARV GPcl in complex with the Fab fragment
of MR78 indicates that MR78 binds the membrane-distal
head of GP1 (Figure 2A). We determined an additional, low-resolution
structure of EBOV GPcl bound to both MR78 and KZ52.
The ternary EBOV complex, determined by molecular replacement,
demonstrates that the MR78 antibody recognizes a similar
site on both MARV and EBOV (Figures 2B, S2, and Table S1).
MR78 binds into a highly conserved hydrophobic trough revealed
at the top of the EBOV GP1 core, after removal of the
glycan cap by proteolytic cleavage in the endosome. Although
MARV and EBOV diverge significantly in sequence overall, residues
contained in this site, the MR78 epitope, are 85% similar
between the viruses (Figures 3A and S1B).

Likely Receptor-Binding Site 

The location and structural conservation of this site suggest that
it could be the binding site of the NPC1 receptor, used by all
known filoviruses (Carette et al., 2011; Côté et al., 2011; Miller
et al., 2012; Ng et al., 2014). Indeed, in ELISA, MR78 inhibits
binding of NPC1 domain C to MARV GP (Figure S3A). This
site, at the apex of cleaved GP1, resembles an ocean wave
morphology, with a lower trough beneath a rising crest. The
trough is hydrophobic and is formed by α1, β4 and the loop
that connects them (residues 63-74 in MARV). It is 22 Å wide
and 8 Å deep at F72. The crest is hydrophilic, includes charged
residues previously identified as essential for virus entry (Dube
et al., 2009; Manicassamy et al., 2005; Manicassamy et al.,
2007), and is formed by strands β7, β9 and their connecting
loops (residues 92-106 and 120-134 in MARV). The 120-134
loop contains H131, which replaces the cysteine and the intra-GP1
disulfide bond of EBOV (Figure 3B).

Figure 2. MR78 Binds Both MARV and EBOV GPcl at the Apex of GP1

(A) 3.6 Å crystal structure of MARV GPcl in complex with Fab MR78. Each GP1 is colored a
different shade of purple, GP2 is gray, and the MR78 Fab is in yellow.

(B) 8 Å structure of EBOV GPcl in complex with Fab MR78, determined by molecular replacement and
rigid body refinement. Each EBOV GP1 is colored a different shade of blue and GP2 is gray. See also
Figure S2. Fab MR78 (yellow) binds the apex of GP1 of both viruses.

Here, we show by ELISA that a Q128S and N129S double
mutant in MARV GP abrogates binding to NPC1 domain C (Figure
S4A). Q128 and N129 are at the tip of the crest and could
make direct hydrophilic interaction with NPC1. The trough itself
is formed by hydrophobic side chains, such as F72 (equivalent to
F88 in EBOV). Also forming the trough are the main chains of hydrophilic
residues; these polar side chains reach away from the
trough into the trimer to make key stabilizing contacts to GP2.
Two examples are R73 and K79, previously shown to be essential
for MARV infectivity (Manicassamy et al., 2007). In the crystal
structure, R73 makes multiple hydrogen bonds to the fusion loop
of the neighboring protomer in the trimer (Figure S4B) and likely
plays a key role in maintaining the prefusion structure or transmitting
a conformational change to the fusion loop after receptor
binding. K79 interacts with the main chain of residues 574-577 of
GP2 (Figure S4C), residues that connect the separated helical
segments of the first heptad repeat. We propose that binding
of NPC1 domain C involves contact with the hydrophilic crest
and hydrophobic trough, and that binding in the trough may
transmit conformational changes to GP2 via R73 and K79 (equivalent
to R89 and K95 in EBOV). Although MR78 binds both MARV
and EBOV GPcl, it only outcompetes NPC1 domain C for binding
of MARV GPcl (Figure S3B). MR78 may have lower affinity for
EBOV GPcl than MARV GPcl or domain C may bind the GPs
slightly differently.

GP-MR78 Interactions

The interaction surface between the MR78 antibody and MARV
GP buries 976 Å² of molecular surface and is primarily hydrophobic.
Contact is mediated by both the heavy and light chains, but
the primary region of interaction is the 17-residue CDR H3 (Figures
3C and S3CD), which penetrates the hydrophobic trough
in MARV GP1. In this interaction, F111.2 and Y112.2 of the
CDR H3 interact with P63, S67, W70, F72, I95 and I125 of
MARV GP (IMGT numbering, Figure 4A).

Notably, these interactions are similar to those made by the
Ebola virus glycan cap, which occupies this site prior to enzymatic
cleavage in the endosome. In Ebola virus, the equivalent interactions
are made by F225 and Y232 of the EBOV glycan cap
interacting with P80, T83, W86, F88, L111 and V141 on EBOV
GP (Figure 4B). Similarity may even extend to the key domain C
of the NPC1 receptor itself, as domain C contains similar phenylalanine
and tyrosine residues that are essential for binding GP
(Ndundo and Chandran, personal communication). Further, F72 in MARV,
which is equivalent to F88 in EBOV, interacts with CDR H3 in the
bottom of the trough. Both F72 of MARV and F88 of EBOV are
critical for attachment and entry (Martinez et al., 2013; Mpanju
et al., 2006) and may interact directly with essential hydrophobic
residues of domain C. The binding mode of MR78 is reminiscent of
anti-influenza virus human mAbs in which long CDR H3s similarly
reach into the conserved receptor-binding site (Barbey-Martin et al.,
2002; Bizebard et al., 1995; Hong et al., 2013; Lee et al., 2014; Schmidt
et al., 2013; Whittle et al., 2011; Xu et al., 2013). In many cases
those influenza mAbs also use Phe or Tyr aromatic residues to
interact with an aromatic residue in the viral receptor binding
domain, suggesting that the favorable energetics and inter-
molecular interactions of common aromatic molecules may
constitute a canonical mode of binding of antiviral antibodies to
recessed receptor-binding sites.

Although the MR78 epitope is largely conserved in sequence 
and structure between MARV and EBOV, it differs in its exposure 
at different stages of virus entry. MR78 binds MARV GP equally 
well whether MARV GP is in its uncleaved, viral-surface form or 
its cleaved, endosomal form. In contrast, MR78 does not bind uncleaved
EBOV GP. It only binds the endosomal, cleaved form from
which the glycan cap has been removed. Together, these results 
suggest that in EBOV, the glycan cap effectively blocks the MR78 
epitope and putative receptor-binding site on the (uncleaved) viral 
surface, but that in MARV, the epitope and at least part of the 
receptor-binding site is fully exposed on the viral surface. Better 
exposure of this site may explain why antibodies against the puta- 
tive receptor-binding site were elicited by MARV infection (see 
companion paper by Flyak et al., 2015), but seem to be more rarely
elicited and have not yet been described against EBOV. 

Figure 3. MR78 Recognizes a Conserved Epitope at the Apex of Cleaved GP1

(A) Conservation of the MR78 epitope among filovirus GPs, mapped onto one monomer of MARV GPcl. Sequence alignment was performed in ebolavirus (Ebola,
Sudan, Reston, Taï Forest, Bundibugyo), marburgvirus (Musoke, Angola, Popp, Ci67, DRC1999, Ravn), and cuevavirus (Lloviu) genera. Residues identical
across the filoviruses are colored red; residues that possess strong similarity, magenta; weak similarity, pink; no similarity, gray.

(B) The apex of cleaved MARV GP1, where Fab MR78 binds, forms a wave crest-and-trough morphology (magenta). The hydrophilic crest and the hydrophobic
trough each contain residues previously shown to be critical for virus entry (Dube et al., 2009; Manicassamy et al., 2005; Manicassamy et al., 2007; Mpanju et al.,
2006). The diagonal black line indicates the base of the trough. See also Figure S4.

(C) Surface representation of the interface between one monomer of MARV GPcl (bottom) and Fab MR78 (top). CDR H1 is colored red; CDR H2, orange; CDR H3,
purple; CDR L1, blue; CDR L2, green; CDR L3, forest green. The footprint on MARV GPcl is colored according to the CDR that mediates the contact. GP residues
contacted by MR78 are indicated and colored according to the CDR that mediates the contact (CDR names in parentheses).

Differences in Mucin-Like Domains between MARV and
EBOV, and Possible Effect on Antibody Reactivity

In addition to a glycan cap, the GP spike on the viral surface includes
three heavily glycosylated mucin-like domains that are
~75 kDa each in mass and are predicted to have little secondary
structure. All mucin-containing GPs thus far have been refractory
to crystallization. In order to visualize the native glycoprotein ectodomain
and position of the mucin-like domain relative to the receptor-binding
core, we turned to Small-Angle X-ray Scattering
(SAXS) in solution. SAXS data collected for mucin-containing
EBOV or MARV GP trimers indicate that the mucin-like domains
of both viruses are large and extend outward from the GP core.
The radius of gyration, Rg, for mucin-deleted and mucin-containing
MARV GPs is 50 and 72 Å, respectively, and the maximum
dimension, Dmax, for mucin-deleted and mucin-containing GPs
is 160 and 250 Å, respectively, indicating that the mucin-like
domain of MARV widens the molecule up to 90 Å (Figures 5A
and S5). The mucin-like domains of MARV are a bit larger than
those of EBOV (67 Å Rg and 225 Å Dmax for mucin-containing
EBOV GP), consistent with their greater volume determined by
SAXS (Figure S5G) and mass noted by SDS-PAGE (Figure S5D).
The mucin-like domains of EBOV appear to project more upward
(consistent with EM tomography [Tran et al., 2014]), while
those of MARV appear to project less upward, more equatorially,
and to cover the sides of the GP trimer (Figures 5A and 5B).
Although the mucin-like domains are likely flexible (see Porod-Debye
coefficient P in Figure S5G), an equatorial, rather than
upward projection is consistent with attachment points of the
mucin-like domain to both GP1 and GP2 in MARV. In EBOV,
a different position of the furin cleavage site results in all of
the mucin-like domain being attached to GP1. The MARV GP2
portion of the mucin-like domain, residues 436-509, is attached
to residue 510 on the side of the MARV GP trimer, but is flexible
and disordered.



A differing position of the mucin-like domains between MARV 
and EBOV would leave different surfaces exposed for immune 
recognition. The equatorial projection of the MARV mucin-like 
domain, for example, would leave the expected receptor-bind- 
ing site at the top more accessible on MARV than EBOV, and 
further supports the notion that antibodies against the expected 
receptor-binding site would be more likely to be elicited using 
marburgvirus antigens than ebolavirus antigens. The accompanying
paper (Flyak et al., 2015) and other immunization studies
(Qiu et al., 2011; Wilson et al., 2000) support this notion.

In contrast, on EBOV, the upward projection of the mucin-like 
domains and the absence of mucin attached to EBOV GP2 
would leave the EBOV base more exposed for antibody surveil- 
lance, compared to that of MARV. Indeed, in the accompanying 
paper by Flyak et al., none of the 18 neutralizing antibodies
raised against MARV appear to bind the base of the MARV GP,
while multiple neutralizing antibodies elicited by ebolaviruses
are known to bind, or thought to bind, to the base of ebolavirus
GP (Dias et al., 2011; Lee et al., 2008; Qiu et al., 2012; Murin
et al., 2014) (Figure 5C).

DISCUSSION 

In summary, the crystal structures and accompanying experi- 
ments indicate that MR78 binds a conserved site on the apex 
of GP1 that is available on the surface of MARV GP, but masked 
on EBOV GP prior to enzymatic cleavage. The epitope of MR78 
likely overlaps with the receptor-binding site, and hydrophobic 
contacts made by CDR H3 to the hydrophobic trough may mimic
Figure 4. Similarity in Recognition of the Putative
Receptor-Binding Site by MR78 and
the Ebola Virus Glycan Cap

(A) The CDR H3 of MR78 (yellow) reaches into the
hydrophobic trough of GP1 (purple). F111.2 and
Y112.2 of CDR H3 interact with P63, S67, W70,
F72, I95, and I125 of MARV GP.

(B) Similar residues of the EBOV glycan cap (light
blue) bind into this trough on the surface of EBOV
GP (blue), prior to enzymatic cleavage. Here, F225
and Y232 of the glycan cap interact with P80, T83,
W86, F88, L111, and V141 in the trough (PDB ID:
3CSY).




those of the as-yet-unvisualized NPC1 domain C. MR78 does not 
neutralize authentic EBOV, likely because its epitope is masked 
on the EBOV surface by the mucin-like domain and glycan cap 
on the virus surface. MR78 does, however, neutralize authentic 
MARV (Flyak et al., 2015) and could be a valuable monoclonal 
antibody therapeutic against this extremely lethal virus. Impor- 
tantly, no mAb therapeutic yet exists against MARV, and few 
mAbs are yet known against MARV from which such a therapeu- 
tic could be developed. The crystal structure of MARV GP pre- 
sented here, and the highly conserved MR78 epitope, provide 
strategies for immunotherapy and templates for development 
of potentially broad-spectrum inhibitors of filovirus entry. 

EXPERIMENTAL PROCEDURES 

Construction, Expression, and Purification of MARV/EBOV GP 

DNA encoding the MARV GPΔmuc ectodomain (residues 1-636 with a mucin
deletion of residues 257-425), point mutants of MARV GPΔmuc, and the
EBOV GPΔmuc ectodomain (residues 1-637 with a mucin deletion of residues
314-462) were amplified by PCR using codon-optimized and whole-gene synthesized
MARV or EBOV GPs as templates. Four point mutations in MARV
GPΔmuc, F438L, W439A, F445G, and F447N, on GP2, located around the furin
cleavage site, were found to improve the efficiency of furin cleavage. GP constructs
were cloned into a derivative of the expression vector pMT. This derivative
vector contains the puromycin resistance gene and a C-terminal double-strep
tag sequence. Expression plasmids were transfected using Effectene (QIAGEN)
into 80% confluent Drosophila Schneider S2 cells. The cells were first cultured
in complete Schneider’s medium supplemented with 10% (v/v) FCS (LONZA),
and were adapted to Insect Xpress medium by progressively modifying the
Schneider/Insect Xpress medium ratio with 6.0 µg/ml puromycin. Large-scale
expression of the MARV/EBOV GPΔmuc was performed using stable S2 cell
lines in 2 l Erlenmeyer flasks at 27.0°C, induced with 0.5 mM CuSO4. Supernatants
containing the expressed proteins were harvested 4 days after induction,
and mixed with the Strep-Tactin affinity column binding buffer (100 mM
Tris-HCl, 150 mM NaCl, 1 mM EDTA, 15 µg/ml Avidin [pH 8.0]). The proteins
were purified via Strep-Tactin affinity, followed by Superdex 200GL 10/300
(GE Healthcare Life Sciences) size-exclusion chromatography (S200 SEC).



Preparation and Crystallization of GP- 
Antibody Complexes 

To mimic endosomal protease cleavage and produce
MARV GPcl, MARV GPΔmuc was incubated
with 0.01 mg trypsin at 37°C for 1 hr in 20 mM
TBS [pH 8.0], 100 mM NaCl. The reaction was
stopped using 0.5 mM 4-(2-Aminoethyl) benzenesulfonyl
fluoride hydrochloride (AEBSF), and the
protein was purified by S200 SEC. EBOV GPcl
was produced by incubating EBOV GPΔmuc
with 0.02 mg thermolysin overnight at room temperature
in 20 mM TBS [pH 7.5], 100 mM NaCl, 1 mM CaCl2, and purified by
S200 SEC. Hybridoma cells expressing the human MR78 antibody were
generated from peripheral blood mononuclear cells (PBMCs) from a donor
who contracted MARV infection in the Python Cave in Queen Elizabeth National
Park, Uganda in 2008 (see Flyak et al., 2015). MR78 was expressed in
serum-free medium (Hybridoma-SFM, GIBCO), and culture supernatants
were centrifuged, sterile-filtered, and purified over HiTrap Protein G columns
(GE Healthcare Life Sciences). Fab fragments were generated by standard
papain digestion, with released Fc and undigested IgG removed by Protein
A chromatography, and remaining Fab fragments further purified by MonoQ
ion-exchange chromatography. For crystallization, purified MARV GPcl was
mixed with excess Fab MR78 for 2 days at 4°C. Complexes were separated
from unbound Fab via S200 SEC. Crystals were grown by hanging-drop vapor
diffusion at 20°C using 0.8 µl protein (13.0 mg ml⁻¹ in 20 mM Tris-HCl
[pH 8.0], 100 mM NaCl) and 0.8 µl of mother liquor (100 mM NaCl, 50 mM
MES [pH 6.5], 13% PEG 4000, 0.5% ethyl acetate). These crystals were cryoprotected
with 25% glycerol plus mother liquor before flash cooling in liquid
nitrogen. One crystal diffracted to a resolution of 3.6 Å. EBOV GPcl was complexed
with Fabs KZ52 and MR78 and crystallized using hanging-drop vapor
diffusion at 20°C with 1.0 µl of protein (6 mg/ml, 150 mM NaCl, 10 mM Tris
[pH 7.5]) and 1.0 µl of mother liquor (100 mM NaAcetate [pH 4.6], 200 mM
NH4SO4, 10% PEG 3350, 2% PEG 400). The crystals were then cryoprotected
by washing in 100 mM NaAcetate [pH 4.6], 200 mM NH4SO4, 12%
PEG 3350, 10% PEG 400, 10% ethylene glycol. Only diffraction to 8 Å
was obtained, but these data permitted molecular replacement using Phaser
(McCoy et al., 2007) and EBOV GP and KZ52 (Lee et al., 2008) as search
models.

ACCESSION NUMBERS 

Coordinates and structure factors have been deposited into the Protein Data 
Bank under the accession code 3X2D. 

SUPPLEMENTAL INFORMATION 

Supplemental Information includes Extended Experimental Procedures, five 
figures, and one table and can be found with this article online at
http://dx.doi.org/10.1016/j.cell.2015.01.041.
Figure 5. MARV and EBOV Present Different Surfaces for Antibody Recognition

(A and B) Molecular envelopes of mucin-containing MARV and EBOV GP ectodomains determined by SAXS. Rendered Gaussian distributions of molecular
envelopes are illustrated in light gray, with ribbon models of the crystallized MARV GPcl and EBOV GPΔmuc trimers to scale and overlaid for comparison. The
trimers are illustrated as ribbons. Note that the glycan cap was removed from MARV GP used in crystallization in order to improve diffraction but was contained in
the complete MARV GP used for SAXS. The glycan cap did not inhibit diffraction of EBOV GP and is included in the EBOV GP crystal structure. MARV GPcl is
colored in purple (GP1) and gray (GP2). EBOV GPΔmuc is colored blue (GP1), white blue (GP1 glycan cap), and gray (GP2). MARV GP is drawn in two possible
orientations because definitive placement of polypeptide is challenging at this resolution. In either orientation, however, the mucin-like domains of MARV project
sideways, equatorially or downward from the core of GP. In MARV, the mucin-like domain is attached to both GP1 and GP2. By contrast, in EBOV, the mucin-like
domain is attached solely to GP1; there is no anchor at the base. Both these SAXS experiments and previous electron tomography (Tran et al., 2014) agree on the
upward projection of the mucin-like domains in EBOV. See also Figure S5.

(C) Differing positions of the mucin-like domains between MARV and EBOV may lead to elicitation of different types of antibodies. The lower position and GP2
anchor of the mucin-like domain of MARV may better mask the base of GP but expose its upper surfaces, allowing antibodies like mAb MR78 to be elicited. The
upward projection of the EBOV mucin-like domain and absence of any GP2 anchor appear to better mask upper surfaces, but expose the base, allowing
antibodies such as KZ52 (Lee et al., 2008), 2G4, 4G7 (Murin et al., 2014), and 16F6 (directed against Sudan ebolavirus [Dias et al., 2011; Bale et al., 2012]) to be
elicited.
SUMMARY

The breakage-fusion-bridge cycle is a classical 
mechanism of telomere-driven genome instability in 
which dysfunctional telomeres are fused to other 
chromosomal extremities, creating dicentric chro- 
mosomes that eventually break at mitosis. Here, 
we uncover a distinct pathway of telomere-driven 
genome instability, specifically occurring in cells 
that maintain telomeres with the alternative length- 
ening of telomeres mechanism. We show that, in 
these cells, telomeric DNA is added to multiple
discrete sites throughout the genome, correspond- 
ing to regions regulated by NR2C/F transcription 
factors. These proteins drive local telomere DNA 
addition by recruiting telomeric chromatin. This 
mechanism, which we name targeted telomere inser- 
tion (TTI), generates potential common fragile sites 
that destabilize the genome. We propose that TTI 
driven by NR2C/F proteins contributes to the forma- 
tion of complex karyotypes in ALT tumors. 

INTRODUCTION 

Cancer is characterized by genomic alterations that lead to
oncogene activation and/or tumor suppressor loss. These 
changes accumulate during tumor development and can be de- 
tected as translocations, amplifications, or deletions of chromo- 
somal segments. Another major characteristic of cancer cells is 
their unlimited proliferative potential. This feature is dependent 
on the activation of a telomere maintenance mechanism upon 
exit from crisis (Hanahan and Weinberg, 2011). During cell crisis,
genome instability and telomere dysfunction have been primarily 
linked through the classical mechanism of breakage-fusion- 
bridge cycle (BFB) first described by Barbara McClintock 
(McClintock, 1941). In this chain of events, unprotected or 
broken telomeres can fuse to another chromosomal extremity 
via non-homologous end joining. Fusions create di-centric chro- 
mosomes that eventually break at random positions during 
mitosis, generating deletions and amplifications of chromosomal
segments and more unprotected chromosome ends (Murnane,
2012). This cycle persists until chromosomal extremities get stabilized
by telomere addition via telomerase activation. In humans,
telomerase is activated in the majority of cancers. However, in a 
subset of tumors, mostly sarcomas, telomeres are maintained by 
a recombination/amplification mechanism termed alternative 
lengthening of telomeres (ALT) (Bryan et al., 1997). These tumors 
typically harbor highly heterogeneous and complex karyotypes 
(Taylor et al., 2011). The lack of apparent specific translocations
makes it challenging to identify the mechanism driving tumori- 
genesis in these cancers. Efforts to characterize these tumors 
are thus currently limited to identifying specific gene expression 
signatures (Chibon et al., 2010). Similarly, the mechanism underlying
ALT activation and maintenance in these tumors is unknown.
Because tumors in which telomerase is inhibited can
activate ALT in mouse models (Hu et al., 2012), it is critical to 
dissect this pathway to design efficient anti-cancer therapies 
targeting telomere maintenance. We previously showed that 
orphan nuclear receptors of the NR2C/F classes (TR2, TR4, 
COUP-TF1, COUP-TF2, and EAR2), which belong to the nuclear
hormone receptor (NHR) family of transcription factors, are aber- 
rantly associated with telomeres in a prototypic ALT(+) cell line 
(Dejardin and Kingston, 2009). This finding was unexpected, as 
transcription factors usually associate with gene regulatory re- 
gions, and telomeres do not contain classical genes. Here, we 
address the biological relevance of this finding. We identify a crit- 
ical role for these proteins in the ALT process and in active desta- 
bilization of the genome. We dissect the mechanism leading to 
their aberrant recruitment to telomeres, and we show that these 
proteins have a major architectural role: NR2C/F proteins can 
bridge together bound loci in the nuclear space. By promoting 
spatial proximity, NR2C/F proteins favor the telomere-telomere 
recombination necessary for ALT maintenance. Surprisingly, 
NR2C/F-driven spatial proximity also induces the tethering of 
telomeric chromatin to hundreds of regular NR2C/F-binding 
sites throughout the genome. This abnormal organization trig- 
gers the insertion of telomeric material to these sites, and this 
process depends on NR2C/F proteins. Insertions of telomeric 
DNA throughout the genome lead to the creation of potential 
common fragile sites that are known to be prone to breakage 
Figure 1. Analysis of the Binding Profile of NR2C/F Factors at Telomeres by ChIP Sequencing

(A) Normalized amount of telomeric reads in ALT(-) (pink) and ALT(+) (red) libraries prepared from input DNA, TRF2, NR2F/C2, and HMBOX1 IPs (dashed line
displays normalized amount in input libraries).

and translocations. Because this mechanism of telomere-driven
genome instability is fundamentally distinct from the BFB cycles 
and occurs as the consequence of the activation of a telomere 
maintenance mechanism, we name it targeted telomere inser- 
tions (TTI). In line with the role played by NR2C/F factors in TTI, 
we show that these proteins associate with telomeres in primary 
ALT tumors in situ and that this association correlates with the 
extent of karyotype rearrangements. Therefore, NR2C/F have a 
critical role in ALT and in the activation of TTI. We propose that 
this mechanism of telomere-driven genome instability induces 
heterogeneous genomes and contributes to the generation of 
complex karyotypes in ALT sarcomas (Taylor et al., 2011). 

RESULTS 

NR2C/F Factors Bind to Direct Repeats of a Variant 
Telomeric Motif 

To get insights into the significance of NR2C/F binding to telo- 
meres, we first examined how these factors are recruited. A 
mutant NR2C2 protein with point mutations disrupting DNA 
binding (Tanabe et al., 2007) fails to accumulate at telomeres 
(Figures S1A and S1B), suggesting that NR2C/F directly bind
to DNA. NHRs usually associate with DNA as dimers by binding
to a composite sequence made of two half-sites (the
5'-A/GGGTCA-3' motif). Depending on the mutual orientation
and spacing of these half-motifs, the full binding site varies
extensively (Sandelin and Wasserman, 2005). Since the NHR
half-site is related to the canonical telomere basic repeat unit
5'-GGGTTA-3', we hypothesized that NR2C/F could be recruited
to ALT telomeres through binding to an iteration of the naturally
occurring variant 5'-GGGTCA-3' (Allshire et al., 1989). We thus
analyzed by chromatin immunoprecipitation (ChIP) combined 
with high-throughput sequencing the DNA sequences associ- 
ated with NR2C2 and NR2F2, those associated with the canon- 
ical telomere-binding protein TRF2 (de Lange, 2005), and with 
HMBOX1, another DNA-binding protein that we previously identified
at telomeres regardless of the maintenance mechanism
(Dejardin and Kingston, 2009). This analysis was performed in 
both ALT(+) and in ALT(-) cell lines (WI-38 VA13 2RA and 
HeLa 1.2.11 cells, respectively) using validated antibodies (Figures
S2A, S4A, and S7A). While TRF2 and HMBOX1 are enriched
at both types of telomeres, NR2C2/F2 proteins bind only to
ALT(+) telomeres (Figures 1A and S2A), consistent with our original
findings (Dejardin and Kingston, 2009). Even when overexpressed,
these factors cannot be detected at ALT(-) telomeres
(Figure S1D), ruling out an effect due to differences in the expression
level. To further characterize the binding mode of these
factors, immunoprecipitated sequences were categorized ac- 
cording to their content in the canonical telomere motif GGGTTA 
(from one to eight occurrences in each 50-nt-long sequencing 
read). The canonical telomere motif has a similar distribution in 
the TRF2 and HMBOX1 libraries, implying similar binding specificity
in both cell types (Figure 1B). In these libraries, the majority
of enriched reads contain seven or eight occurrences of the
canonical motif, suggesting that TRF2 and HMBOX1 bind to
the canonical telomere sequence in vivo, as expected (Bilaud 
et al., 1997; Broccoli et al., 1997; Kappei et al., 2013). In contrast, 
reads associated with NR2C2/F2 showed a different distribution 
of GGGTTA occurrences, indicating a distinct binding specificity 
(Figure 1B). To characterize the motifs that allow specific
NR2C2/F2 binding, we analyzed these reads further (red
brackets in Figure 1B) and found that, among all possible sequences,
the GGGTCA motif was specifically enriched in
NR2C2/F2 reads (Figure 1C). This is in agreement with the clas- 
sical sequence-specific binding mode for NHR and our original 
hypothesis (Benoit et al., 2006; Dejardin and Kingston, 2009; 
Conomos et al., 2012). Although less frequent (by ~6-fold), the 
GGGTCA motif is also present in ALT(-) telomeres, indicating 
that the simple presence of the motif is not sufficient to promote 
NR2C/F recruitment. Thus, we analyzed the occurrence of the 
NHR motif in telomeric reads. GGGTCA is essentially found as 
a multimer in ALT(+) sequences and as a monomer in ALT(-) 
reads (Figure 1D), suggesting that NR2C/F cannot be recruited
to a single GGGTCA motif at telomeres. Since these data suggest
a classical binding mode for NR2C/F factors, we searched for 
the full binding sites for these proteins. We identified the direct 
repeats DR0, DR6, and DR7 (two half-sites in the same orientation
and separated by 0, 6, or 7 nucleotides) as the major NR2C/F-binding
sites at telomeres (Figure 1E), and these sites are, at
least for DR0, ~80-fold enriched in ALT(+) compared to ALT(-)
telomeric DNA. Therefore, NR2C/F recruitment is promoted by
the presence of DR0, DR6, and DR7 motifs specifically in ALT
telomeres.
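The read categorization and direct-repeat search described in this section can be illustrated with a short Python sketch. It is not the pipeline used in the study: the function names `count_motif` and `find_direct_repeats`, the single-strand matching, and the example read are all assumptions made for illustration.

```python
import re

CANONICAL = "GGGTTA"  # canonical telomeric repeat unit
VARIANT = "GGGTCA"    # variant repeat matching the NHR half-site


def count_motif(read: str, motif: str) -> int:
    """Count non-overlapping occurrences of motif in a read (given strand only)."""
    return read.upper().count(motif)


def find_direct_repeats(read: str, half_site: str = VARIANT, spacers=(0, 6, 7)):
    """Return sorted (position, spacer) pairs for two half-sites in the same
    orientation separated by exactly `spacer` nucleotides (DR0, DR6, DR7)."""
    read = read.upper()
    hits = []
    for spacer in spacers:
        # A lookahead reports every candidate start, including overlapping sites.
        pattern = re.compile(f"(?={half_site}[ACGTN]{{{spacer}}}{half_site})")
        hits.extend((m.start(), spacer) for m in pattern.finditer(read))
    return sorted(hits)


# Hypothetical 50-nt read, invented for illustration only.
read = "GGGTTAGGGTTAGGGTCAGGGTCAGGGTTAGGGTTAGGGTTAGGGTTAGG"
print("GGGTTA occurrences:", count_motif(read, CANONICAL))
print("GGGTCA occurrences:", count_motif(read, VARIANT))
print("direct repeats (position, spacer):", find_direct_repeats(read))
```

A real analysis would also need to handle the reverse-complement strand and sequencing errors, which this sketch deliberately ignores.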

The Telomere Protein TRF2 Binds to Hundreds of 
NR2C/F Regions throughout the Genome of ALT Cells 

The aberrant recruitment of NR2C/F factors could suggest that 
telomeres potentially act as “molecular sinks” for these tran- 
scription factors in ALT cells. Titration could impinge on the bind- 
ing and the regulation of NR2C/F targets, which would indirectly 
control ALT and/or tumorigenesis (Safe et al., 2014). Therefore, 
we analyzed the genome-wide binding profile of NR2C2 and 



(B) TTAGGG content of telomeric reads in IPs from ALT(-) (left) and ALT(+) (right) cells. Histograms display for each library the percentages of telomeric reads
containing one to eight TTAGGG occurrences. Red brackets highlight subsets of telomeric reads containing three or four TTAGGG occurrences that are strongly
enriched in ALT(+) libraries.

(C) Percentage of telomeric reads containing the indicated repeat variant in ALT(+) and ALT(-) libraries prepared from input DNA, TRF2, NR2F/C, and HMBOX1
IPs. Variant repeats were identified within telomeric reads that contained three or four TTAGGG occurrences and were sorted based on their relative amount in
telomeric reads from ALT(+) NR2C/F libraries.

(D) Pie charts showing that GGGTCA multimerization is specific for ALT(+) telomeres. The charts display the number of telomeric reads containing one to seven
GGGTCA (left) or NHR unrelated variant "GGGTTG" (right) occurrences (n indicates the number of telomeric reads containing the GGGTCA or GGGTTG variants
in ALT(+) and ALT(-) input libraries). GGGTTG is used as a control to show that multimerization is specific for the GGGTCA motif.

(E) Normalized number of telomeric reads containing GGGTCA DR0, DR6, DR7, and DR12 in libraries prepared from ALT(-) input and ALT(+) input and orphan
receptors NR2F/C IPs.

See also Figures S1 and S2. 
Figure 2. TRF2 Binds to Hundreds of Loci throughout the Genome in ALT Cells

(A) Overlap between TRF2- and NR2F/C-binding sites genome wide. (Top) Venn diagrams displaying peak overlap (values indicate the number of peaks).
(Bottom) Average read densities in NR2F/C libraries relative to the TRF2 peaks.

(B) Density profiles of input, TRF2, and NR2F/C reads in two representative loci in ALT(-) and ALT(+). TRF2 is only bound in ALT(+).

(C) Chromosomal locations of ALT(+) TRF2 peaks, ALT(+) NR2C/F peaks overlapping with TRF2 peaks, and ALT(-) NR2C/F peaks.

NR2F2 and concluded that ALT(+) telomeres do not titrate these
transcription factors since we identify more binding regions in
ALT(+) than in ALT(-) genomes (Figure 2A). We also analyzed
the genomic distribution of TRF2, as it supposedly binds not
only to telomeres, but also to rare interstitial telomeric sequences
(ITS) (Simonet et al., 2011). ITS contain a variable number
of iterations of the GGGTTA motif, explaining TRF2
recruitment, and are well-characterized common fragile sites
(CFS) (Bosco and de Lange, 2012). Surprisingly, in ALT(+) cells,
we identified several hundred TRF2-binding sites (Figure 2A)
that do not map to known ITS. However, most of these sites
(75%) overlap with binding sites for NR2C2/F2 proteins (Figures 
2A and 2B). Importantly, none of these regions were bound by 
TRF2 in ALT(-) cells, pointing to an ALT-specific binding mode 
for TRF2 (Figure 2B). In contrast, in ALT(-) cells, TRF2 binds 
only to 45 regions, 20 of which corresponded to known ITS often 
located in subtelomeric regions (Simonet et al., 2011) (Figures 
2A-2C). Interestingly, when we looked at the position of these 
sites with respect to genes, we found that the NR2C2 regions 
that also recruit TRF2 have a broad distribution and are usually 
located far from gene promoters (Figure 2D). In contrast, the 
TRF2-negative NR2C2 regions are principally located at gene 
transcription start sites. Moreover, the sequence content of 
these two populations is different. The classical promoter-bound 
NR2C2 regions are enriched in the expected motifs, in particular 
the ETS binding sequence (O’Geen et al., 2010) (Figure 2E). On 
the other hand, the non-genic NR2C2 regions bound by TRF2 
lack the ETS motif but are highly enriched in the GGGTCA motif 
(81% reads). Remarkably, most of these sites (~87%) also lack
the canonical telomere sequence, excluding a classical DNA- 
mediated recruitment mechanism for TRF2. Thus, our data point 
to an unusual recruitment mode for TRF2 throughout the 
genome of ALT cells. 
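The overlap statistics quoted in this section (for example, the fraction of TRF2 sites that coincide with NR2C2/F2 sites) come down to interval intersection between two peak sets. The following Python sketch shows one simple way to compute such a fraction. It is only an assumption-laden illustration: the `Peak` tuple layout, the half-open coordinate convention, the function names, and the example coordinates are hypothetical and are not taken from the study.

```python
from typing import List, Tuple

# (chromosome, start, end) with half-open coordinates; an assumed convention.
Peak = Tuple[str, int, int]


def overlaps(a: Peak, b: Peak) -> bool:
    """True if two peaks lie on the same chromosome and share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def fraction_overlapping(query: List[Peak], reference: List[Peak]) -> float:
    """Fraction of query peaks that overlap at least one reference peak."""
    if not query:
        return 0.0
    n_hits = sum(1 for q in query if any(overlaps(q, r) for r in reference))
    return n_hits / len(query)


# Invented coordinates, for illustration only.
trf2_peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 150), ("chr3", 10, 20)]
nr2cf_peaks = [("chr1", 150, 250), ("chr1", 590, 700), ("chr2", 0, 60), ("chr4", 10, 20)]
print(f"{fraction_overlapping(trf2_peaks, nr2cf_peaks):.0%} of query peaks overlap")
```

The quadratic scan is adequate for a few hundred peaks; genome-scale peak sets would normally use a sorted sweep or an interval tree instead.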

NR2C/F Proteins Induce Spatial Proximity of Their 
Binding Loci 

A simple interaction between TRF2 and NR2C2/F2 only in ALT(+) 
cells is unlikely because a large number of NR2C2/F2 sites 
remain TRF2 free. Moreover, we failed to detect any interaction 
between TRF2 and NR2C/F proteins by coimmunoprecipitation 
(data not shown). Additionally, the absence of telomeric motifs 
at NR2C2/F2 sites likely excludes a direct TRF2 recruitment. 
Thus, we hypothesized that physical interactions between ALT 
telomeric material and endogenous NR2C/F-binding sites occur 
via NR2C/F proteins (Figure 3A). Accordingly, NR2C/F proteins 
would bridge ALT telomeres and extra-chromosomal telomeric 
material generated by the recombination process (Cesare and 
Griffith, 2004) together and to endogenous NR2C/F regions. By 
carrying over telomeric material, TRF2 is most probably crosslinked
by formaldehyde "in trans" at endogenous NR2C/F regions,
resulting in the appearance of enrichment peaks for
TRF2 throughout the ALT genome at NR2C/F binding regions. 



The same bridging feature would explain why telomeres exten- 
sively interact with each other in ALT cells. Classical methods 
to measure locus interactions like chromosome conformation 
capture (Dekker et al., 2002) are challenging for repetitive se- 
quences like telomeres. Thus, to test telomere bridging by these 
factors, we used super-resolution three-dimensional structured 
illumination microscopy (3D-SIM), which improves spatial reso- 
lution by a factor of eight (Gustafsson et al., 2008). We expressed 
the DNA-binding mutant NR2C2 protein in the SaOS-2 ALT cell 
line. This mutant acts as a dominant-negative for endogenous 
NR2C2 (Tanabe et al., 2007), stripping it off telomeres (Figure
S1C). In SaOS-2 cells, NR2C2 is the only orphan nuclear receptor
bound to telomeres, suggesting that no other NHR could
compensate for NR2C2 loss. The expression of this mutant dis- 
rupts telomere-telomere interactions, as shown by super-resolu- 
tion microscopy (Figure 3B). The number of telomeric clusters is 
reduced, and this is accompanied by an increase in the number 
of single detectable telomere signals (Figures 3B-3E, S3A, and 
S3B). On the other hand, expression of the mutant form of 
NR2C2 in HeLa 1.2.11 ALT(-) line has no effect on telomere 
number (Figure S4C). Similar results were obtained upon simultaneously
knocking down NR2C1, NR2C2, and NR2F2 (NR2C/F)
proteins in the WI38-VA13 ALT cell line (Figures S4A and S4B). 
Next, to evaluate bridging between telomeric material and non- 
telomeric NR2C/F regions, we developed an independent assay. 
In this approach, the sub-nuclear localization of a fluorescent 
plasmid DNA harboring NR2C/F DR0-binding sites can be
tracked (Figure S3C). If our bridging hypothesis is valid, this
plasmid should be targeted to ALT(+) telomeres, which concentrate
NR2C/F proteins. In contrast, this plasmid should not
co-localize with ALT(-) telomeres, which are devoid of NR2C/F
factors. Indeed, despite the formation of large cytoplasmic aggregates
in transfected cells, which were unavoidable, the
plasmid is efficiently targeted to ALT(+), but not to ALT(-), telomeres
(Figure S3C), whereas the control plasmid, not containing
NR2C/F binding motifs, does not accumulate at telomeres. 
Moreover, this recruitment is NR2C/F dependent, as it disap- 
pears upon NR2C/F depletion by RNAi. Next, we measured 
whether NR2C/F tethering to a LacO transgenic locus non- 
homologous to telomeric sequences is sufficient to drive the 
proximity of that locus to ALT telomeres. We used a transgenic 
U2-OS ALT cell line containing a single LacO array (Robinett 
et al., 1996), in which we expressed an NR2C2-LacI fusion protein,
able to bind to the LacO array in the absence of NHR binding
motif. Tethering NR2C2-LacI to LacO leads to the extensive
co-localization of the array to telomeric clusters (85% co-localization,
Figure 3F), consistent with the bridging feature of this factor.
Unexpectedly, this also leads to the appearance of multiple 
LacO signals co-localized with telomeric signals (82% of cells 
were showing, on average, ten independent LacO foci), suggest- 
ing a dramatic instability of the LacO array upon interaction with 
ALT telomeres. Neither the tethering of GFP alone nor the 



(D) Distribution of NR2C2 peaks overlapping (+) or not (-) with TRF2 peaks in ALT(+) cells. (Top) Pie charts displaying categories according to the nearest
transcriptional start site (promoter, TSS). (Bottom) Boxplots showing the distribution of NR2C2 peaks to the nearest TSS.

(E) Overrepresented motifs in NR2C2 peaks overlapping (+) or not (-) with TRF2 peaks in ALT(+) cells (red curves display the average location of the motif around
central peak positions, and values indicate the percentage of peaks containing the motif).

See also Figures S1 and S2. 
Figure 3. Locus Proximity Induced by NR2C/F Proteins

(A) Model of NR2C/F-induced proximity of telomeric and genomic sites. The same model applies to the telomere/telomere proximity necessary for recombination
in ALT. 

(B) Telomere FISH showing “de-clustering” of telomeres (visible by 3D-SIM super resolution microscopy) and increased number of telomeric foci upon NR2C2- 
DN expression in SaOS-2 cells, suggesting that telomere/telomere interactions are dissociated. The arrowheads indicate single telomeric foci within clusters. 
Clusters were defined as single signals in the wide field mode, which could be resolved as at least two individual signals in super-resolution mode. Right panels 

tethering of NR2C2-LacI in a transgenic HeLa ALT(-) cell line
harboring a single LacO array leads to its telomeric re-localization,
or an amplification of LacO (Figures 3F and S3D). Altogether,
our data demonstrate that NR2C/F binding is necessary
for telomere-telomere proximity and for bridging bound loci to
telomeres in the nuclear space. The amplification of LacO in
the NR2C2 tethering experiment also suggests that bridging to
ALT telomeres likely induces instability of the co-localized locus.

Telomere DNA Insertions at NR2C/F Regions in the 
Genome of ALT Cells 

Physical interaction between translocating loci is a major
requirement for chromosomal translocations. In fact, translocations
are primarily driven by the spatial organization of
chromosomes in the nucleus (Misteli and Soutoglou, 2009).
Chromosomal architecture must be highly perturbed in ALT cells
because ALT genomes show complex chromosome rearrangements
with multiple heterogeneous translocations (Guillou and
Aurias, 2010; Jain et al., 2010; Lovejoy et al., 2012). Since
NR2C/F proteins can drive the physical interaction of bound
loci and their apparent instability, we asked whether this feature
could trigger instability at endogenous NR2C/F loci. We hypothesized
that the genomic NR2C/F-binding sites interacting with
telomeric material could also be sites of telomere sequence
insertion. Telomeric DNA insertions at discrete genomic sites
should yield composite sequencing reads that fail to be aligned
to the reference genome. Thus, we focused our analysis on
sequencing reads that had mismatches. In both ALT(+) and
ALT(-) cell lines, ~90% of the input libraries contain perfectly
mappable reads, suggesting comparably high sequencing quality
(Figure S2B). Likewise, reads mapping to repeated DNA are of
comparably high quality (~88% aligned perfectly). However,
while telomere reads from ALT(-) cells are also of high quality,
telomere reads from ALT(+) cells are more degenerated and
~25% cannot be aligned without allowing mismatches (Figure
S2B). To characterize this telomere-specific discrepancy,
we analyzed TRF2 libraries in both ALT(+) and ALT(-) cells and
examined the sequence organization of the reads that could
not be aligned to the reference genome. This showed that,
among all possible random hexamers in rearranged sequences,
there is a striking bias for the canonical GGGTTA and the variant
GGGTCA motifs in ALT(+) samples (Figure 4A). This indicates
that, specifically in ALT cells, reads containing these motifs are
prone to rearrangements, regardless of their location in the
genome. We extended this analysis to NR2C2, NR2F2, and
HMBOX1 libraries. Reads in the NR2C2 and NR2F2 libraries
are even more degenerated than in the TRF2 and HMBOX1 libraries
(Figures S2C and S2D), suggesting that NHR binding regions
are intrinsically more unstable. To get insights into the
nature of these rearranged sequences, we analyzed their content
in detail using the strategy depicted in Figure 4B. Strikingly,
these rearranged sequences are composed of a part mapping to
unique genome regions and aberrant random additions of
GGGTCA and/or GGGTTA sequences (Figure 4B, bottom), suggesting
that they can result from insertions of telomeric DNA.
These insertions do not occur at a precise position but are
always located close to (<30 nt) DR0 motifs. The systematic
presence of DR0 motifs in the non-rearranged portion of these
reads points to the involvement of NR2C/F proteins in targeting
the local rearrangement between ALT telomeric DNA and endogenous
NR2C/F-binding sites. The few ALT(-) sequences that
could not be mapped to the genome contain mostly single-nucleotide
changes with no motif addition, suggesting that
they did not arise from telomeric DNA insertion. In ALT cells,
the rearranged reads map to 23 distinct genomic regions, of
which 19 (~82%) corresponded to TRF2-positive NR2C/F peaks
(Figure S6D). This indicates that only a small subset (19/473,
~4%) of NR2C/F regions able to recruit telomeric material are
in fact loci for targeted telomeric insertions (TTI) in ALT cells.
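
The read-splitting strategy described above can be sketched as follows. This is only a minimal illustration, assuming a toy reference string and exact substring matching in place of a real aligner; the function names (`count_hits`, `is_rearranged`, `telomeric_hexamers`) are hypothetical and are not taken from the Extended Experimental Procedures.

```python
# Canonical and variant telomeric motifs named in the text.
TELOMERIC_HEXAMERS = ("GGGTTA", "GGGTCA")


def count_hits(fragment: str, reference: str) -> int:
    """Count exact (possibly overlapping) occurrences of fragment in reference."""
    hits = 0
    start = reference.find(fragment)
    while start != -1:
        hits += 1
        start = reference.find(fragment, start + 1)
    return hits


def is_rearranged(read: str, reference: str) -> bool:
    """Assumed rule: a read that does not map as a whole is 'rearranged'
    if at least one of its two halves maps uniquely to the reference."""
    if count_hits(read, reference) > 0:
        return False  # the full read maps, so it is not a candidate
    half = len(read) // 2
    halves = (read[:half], read[half:])
    return any(count_hits(h, reference) == 1 for h in halves)


def telomeric_hexamers(read: str) -> dict:
    """Count occurrences of each telomeric hexamer within a read."""
    return {motif: count_hits(motif, read) for motif in TELOMERIC_HEXAMERS}
```

With a 50-nt read whose first 25 nt match a unique locus and whose last 25 nt consist of GGGTTA/GGGTCA repeats, `is_rearranged` returns `True` and `telomeric_hexamers` reports the added motifs. A real pipeline would replace `count_hits` with an aligner that tolerates sequencing errors.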

NR2C/F-Driven Telomeric DNA Insertions at DNA
Double-Stranded Breaks Are Involved in Chromosomal 
Translocations in ALT Cells 

Artificial insertion of telomeric DNA inside genomes creates ITS,
and this has been shown to promote chromosome rearrangements
(Kilburn et al., 2001). Because ITS are potential common
fragile sites (Bosco and de Lange, 2012), addition of telomeric
DNA throughout the genome by TTI can be viewed as a source
of genome instability. As TTI parallels ALT, it must be an ongoing
mechanism in proliferating cells. To demonstrate that TTI is an
active process in ALT cells, we tried to provoke telomere
sequence addition throughout the genome. To this aim, we
induced DNA double-strand breaks (DSB) by drug treatments
or γ irradiation and looked for telomere insertions at these sites
by scoring the number of ITS signals on metaphase chromosome
spreads. Detectable ITS are more frequent in untreated
ALT(+) than ALT(-) chromosomes (~5-fold), in line with our
sequencing data showing enrichment in telomeric sequences
at genomic sites in ALT(+) cells (Figure 5A). Upon DSB induction,
they double in ALT(+). No change was observed in ALT(-) chromosomes,
suggesting that breaks are normally repaired without
telomeric sequences added in this cell line (Murnane, 2012). In
ALT cells, ~30% of both pre-existing and newly formed ITS sites
are also bound by NR2C2 and 39% by TRF1, indicating their
telomeric origin (Figures 5B and 5C). Consistent with previous
data (Bosco and de Lange, 2012), these sites are potentially



show the boxplot quantification of this effect. Top and bottom boxes show the first and third quartile around the mean, p values are from a two-sided Student's 
t test. 

(C) Loss of telomere clustering upon NR2C2 DN expression (aggregate). 

(D) Increase in detectable single telomere number. 

(E) Distribution of telomeres as individual or clustered signals as measured by super-resolution microscopy, p value from a two-sided Student's t test. 

(F) (Left) FISH in U2-OS cells harboring the LacO transgenic array. Cells were transfected either with GFP-LacI (top) or Flag-NR2C2-LacI (bottom). (Middle) Chart
displaying the co-localization of LacO with telomere signals. (Right) Chart measuring the extent of LacO signal amplification as counted by the number of individual
Lac signals in transfected cells. Right panel shows the boxplot quantification of this effect. Top and bottom boxes show the first and third quartiles around
the mean. p values are from a two-sided Student's t test.

See also Figures S3 and S4. 
fragile sites, as we observed increased breakage upon combined
TRF1 knockdown and aphidicolin treatments (Figure S5A).
TTI is NR2C/F dependent because no new ITS formed upon 
NR2C1, NR2C2, and NR2F2 silencing (Figure 5C). Moreover, 
NR2C/F-silenced cells have a significantly higher number of 
chromosomal fusions (arrowheads in Figure 5D), mostly without 
detectable telomeric signal. These fusions greatly increase upon 







Figure 4. Degenerated Sequencing Reads 
in ALT Cells Contain Telomeric Motifs 

(A) Plot showing the frequency of degenerated reads (>3 mismatches) containing different hexamers in ALT(-) and ALT(+) TRF2 IPs. Each dot
represents a different hexamer. Circled in red: hexamers that are abundant in ALT(+), but not in ALT(-).

(B) (Top) Flowchart displays the strategy to identify "rearranged reads" (see Experimental Procedures). Briefly, non-mappable 50-nt-long
reads were split in two 25-nt-long "trimmed" reads that were then individually mapped onto the genome. Non-mappable reads with at least one
"trimmed" read that mapped uniquely were considered as "rearranged." (Bottom) Representative examples of rearranged and intact reads
mapping near a putative OR-binding site (DR0 underlined) of the Chr2 fusion ITS (reference genome sequence from the hg18 assembly on the
top) retrieved from the TRF2 IP library in ALT(+) (top) and ALT(-) (bottom) cells. Bold letters highlight mismatched nucleotides relative to the
reference sequence, and italic letters perfectly mapped nucleotides. Red and green boxes highlight GGGTTA and GGGTCA repeats, respectively,
found in the reference genomic sequence (light color) or only in the rearranged reads (dark). Asterisks highlight rearranged reads identified also in
the NR2F/C2 libraries. In the ALT(-) panel, the arrows indicate single-nucleotide substitutions.



DSB induction (up to 30% of fused chromosomes). This also suggests the
involvement of NR2C/F proteins in preventing chromosomal fusions, a
necessary condition for the maintenance of telomeric integrity in ALT cells.
Similar results were also obtained in another ALT(+) cell line and with
another DNA-damaging agent (Figure S5B), demonstrating that telomere
sequence addition to broken chromosomal sites is common in ALT.
Moreover, because NR2C/F depletion has no major effect on TRF1 and TRF2
levels (Figure S4A), this effect is unlikely to be due to a shelterin defect
but suggests a protective role for NR2C/F factors on ALT telomeres. Since
internal telomeric DNA has the inherent potential to form common fragile
sites (CFS) in the human genome (Bosco and de Lange, 2012) and telomeres
are fragile sites (Sfeir et al., 2009), we reasoned that TTI should have the
potential to form translocations. Such translocations could leave telomeric
DNA between two chromosomal segments. DSB induced by γ
irradiation in the VA13 cell line resulted in 88 unique random
break points on 693 analyzed chromosomes. 33% (29) of scored
translocations have detectable telomeric signals at the translocation
points between segments from distinct chromosomes,
as measured by SKY-FISH and telomere FISH on metaphase
spreads (Figures 5E and S6 and Table S1). Out of these 29
events, 13 do not involve terminal fusions, suggesting that TTI
occurs frequently (~15% of scored translocations) when cells
are challenged. Thus, telomeric DNA likely participates in chromosomal
translocations in ALT cells.
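
The proportions quoted above can be rechecked with a few lines of arithmetic. This sketch assumes that both percentages use the 88 scored break points as the denominator, which the text implies but does not state outright; the variable names are illustrative only.

```python
break_points = 88     # unique break points scored after gamma irradiation
with_telomeric = 29   # translocations with a telomeric signal at the junction
non_terminal = 13     # of those 29, events not involving terminal fusions

# Assumption: both fractions are taken over all scored break points.
pct_telomeric = 100 * with_telomeric / break_points
pct_tti = 100 * non_terminal / break_points

print(f"{pct_telomeric:.0f}% with telomeric signal, {pct_tti:.0f}% without terminal fusion")
```

Under that assumption the values round to 33% and 15%, matching the figures given in the text.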

Binding of NR2C/F Proteins to Telomeres Is a Hallmark 
of ALT in Tumors and Correlates with the Extent of 
Genome Rearrangements 

To directly explore the link between NR2C/F association to telo- 
meres, genome instability, and cancer in humans, we checked 
whether these proteins can also be found on telomeres in hu- 
man primary tumors. We analyzed 180 primary sarcomas from 
the “complexity index in sarcoma” (CINSARC) signature collec- 
tion (Chibon et al., 2010) by immunofluorescence and FISH on 
tissue microarrays (see Table S2 for tumors data). ALT nuclei 
contain structures called ALT-associated pro-myelocytic leuke- 
mia bodies (APB), in which the telomeric DNA is abnormally 
associated with the PML protein. The presence of APB is a diagnostic
marker of ALT (Henson et al., 2005). Accordingly, we
scored 54.4% of the primary tumors analyzed as ALT(+) because
these tumors have detectable APB. This is in line with the
average ALT occurrence in sarcomas (Henson and Reddel,
2010), validating our approach. The vast majority of ALT(+) tumors
(~79%) also show telomeric accumulation of NR2F2 or
NR2C2 (Figures 6A and 6B). We also analyzed healthy tissue
sections surrounding 12 distinct tumors, and none (0/12) show 
NR2C2/F2 telomeric accumulation (Figure S7 and Table S3). 
Thus, NR2C/F telomeric accumulations are cancer specific 
and do not predate tumor development. Importantly, the extent 
of NR2C/F association to telomeres correlates with the tumor 
grade (25% in grades 1 and 2, versus 61% in grade 3) (Fig- 
ure 6C). As tumor grade is a strong indicator of genome 
complexity in sarcomas (Chibon et al., 2010), increased telo- 
meric accumulation of NR2C/F proteins mirrors increased 
genome rearrangements. This suggests the involvement of 
NR2C/F proteins in generating complex karyotypes in human 
sarcomas. 

DISCUSSION 

Telomeres are important protective chromosomal structures 
that safeguard the genome. When telomeres are deprotected, 
classical BFB cycles ensue and lead to genome alterations. 
These alterations include deletions and amplifications of chro- 
mosomal segments typically observed in tumors (Figure 7A). 
The BFB cycles, although probably arising as a consequence 
of the initial loss of important genome surveillance mecha- 
nisms, favor the acquisition of oncogenic mutations or accel- 
erate the loss of surveillance pathways that characterize 
transformed cells (Artandi et al., 2000). Importantly, BFB cycles 
are stopped when broken chromosomal extremities are 
healed by the addition of telomeres. Chromosomal healing is 
usually achieved by the reactivation of telomerase, which is 
involved in the creation of functional telomeres. Therefore, it 
is thought that the acquisition of telomere maintenance by 
telomerase reactivation stabilizes the transformed genome 



and favors an unlimited proliferation of selected transformed 
cells. Here, we describe another mechanism of telomere-driven 
genome instability that actually occurs as a consequence of the 
activation of aberrant telomere maintenance (Figure 7B). In 
contrast to the genome stabilization conferred by telomerase, 
we show that ALT activation also directly destabilizes the 
genome, using an unexpected mechanism that we name tar- 
geted telomere insertion (TTI). We propose that TTI contributes 
to the complex karyotypes found in tumors or cell lines in which 
ALT is activated. 

NR2C/F-Mediated Long-Distance Interactions 

The massive recruitment of orphan nuclear receptors at telo- 
meres in most ALT tumors or cell lines underlies a requirement 
to maintain a critical function. Consistently, loss of these pro- 
teins leads to defective telomere maintenance (Conomos 
et al., 2012; Dejardin and Kingston, 2009) and chromosomal fu- 
sions (Figure 5D). We found no evidence for a role of NR2C/F 
proteins in transcribing telomeres (data not shown), but we 
show here that this function is structural. By inducing the prox- 
imity of their binding loci, NR2C/F proteins promote physical in- 
teractions of telomeric material, a necessary requirement for 
recombination. An unexpected consequence of this bridging 
ability is that telomeric material is also able to physically interact 
with non-telomeric NR2C2/F2-binding sites throughout chro- 
mosomes. This represents a further confirmation that bridging 
is a major feature of NR2C/F proteins. Intriguingly, not all 
NR2C/F genomic sites have this ability. Telomere-genome inter- 
actions usually occur at NR2C/F regions located at a distance 
from genes, while promoters bound by NR2C/F do not seem 
to be involved. We believe that the regions able to contact 
telomeres might be enhancers because these elements are 
known to interact at long distance and organize local chromo- 
somal architecture (Smallwood and Ren, 2013). How would 
regions bound by the same transcription factors be located in 
close physical proximity? We can think of at least two possibil- 
ities: either these proteins bind to a shared machinery/structure 
available in limiting amounts, or these factors have the ability to 
engage into homotypic interactions. In line with this, RXR 
proteins, which belong to the NR2B family of NHR, have 
been shown to be able to oligomerize in vitro (Chen and Prival- 
sky, 1995). The biological significance of such architectural 
ability is not totally clear, but the clustering of co-regulated re- 
gions (presumably bound by the same factors) is a recurrent 
feature. A benefit of clustering/compartmentalizing nuclear 
transactions is to increase the local concentration of reactive 
species to ensure the robustness of biological processes (Dejar- 
din, 2012). 

Targeted Telomere Insertions: Implications for Genome 
Stability 

Another NHR, the androgen receptor (AR), was shown to drive 
the proximity of a subset of its target genes upon hormone in- 
duction (Lin et al., 2009; Mani et al., 2009). Combined with gen- 
otoxic stress, proximity is required to promote cancer-specific 
translocations. We show here that the bridging function 
conferred upon NR2C/F binding drives telomere sequence ad- 
ditions throughout the genome upon genotoxic stress. Thus, 
Figure 6. Telomeric NR2C/F Correlate with Sarcoma Grades

(A) Most APB-positive sarcoma samples show NR2C/F localization at telomeres. Immunofluorescence (IF)/telomere FISH in grade 2 leiomyosarcoma
biopsies. (Top) APB scoring by co-localization of the FISH signal (red) to PML bodies (green). (Bottom) Localization of NR2C2 or NR2F2
(green) at telomeres (red) in the same tumors.

(B) Frequency of telomeric NR2C/F in APB(+) and in APB(-) tumors based on the IF/FISH analysis. p value from a two-sided t test.

(C) The frequency of NR2C2/F(+) telomeres increases with the tumor grade. See Extended Experimental Procedures for staining procedures.
See also Figure S7 and Tables S2 and S3.
we believe that, in cancer cells, the bridging ability of transcription
factors, ordinarily used to modulate gene expression, is
frequently diverted to trigger chromosomal translocations. In 
contrast to the well-defined translocation that AR controls in 
prostate cancer, TTI yields heterogeneous rearrangements. 
We think that TTI contributes to the appearance and mainte- 
nance of complex mutator phenotypes at least at two levels: 
(1) the ongoing insertions of telomeric DNA at regulatory re- 



gions probably directly affect neigh- 
boring gene regulation, and (2) inserted 
telomeric DNA, which was shown to 
be prone to breakage, can contribute 
to elevated genomic instability. Our 
sequencing analysis did not allow us
to measure the size of inserted telomeric 
DNA, but the detection of newly formed 
ITS by FISH (Figure 5C) suggests that 
these additions could extend to several 
kilobase pairs. Sarcomas in which ALT 
is active have been shown to systemati- 
cally harbor complex karyotypes with 
non-specific translocations (Montgomery 
et al., 2004; Scheel et al., 2001). 
Although we cannot exclude that other 
mechanisms could be involved, we pro- 
pose that TTI contributes to generating 
a subset of complex chromosomal rearrangements in these 
cancers. 

EXPERIMENTAL PROCEDURES 
Cell Culture 

The U2-OS, WI-38 VA13/2RA (VA13 throughout the text), and Saos-2 cell lines
were obtained from ATCC. The HeLa 1.2.11 cell line was kindly provided by



Figure 5. Translocation Breakpoints Contain ITS in ALT Cells 

(A) (Top) Outline of the ITS induction assay; (bottom) boxplot showing the percentage of chromosomes with ITS signals in different conditions. Top and bottom 
boxes show the first and third quartiles around the mean, p values are from a two-sided Student's t test. 

(B) IF/FISH showing localization of NR2C2 (green) to a DSB-induced ITS signal (red) on a metaphase chromosome (VA13 cells). 

(C) IF/FISH showing localization of TRF1 (red) to a DSB-induced ITS signal (green) on a metaphase chromosome (VA13 cells).

(D) (Left) Telomeric FISH on chromosome spreads of Scr RNAi and triple knockdown VA13 cells after DSB drug treatment (bleomycin) showing chromosome fusions
(arrows); mock, no treatment. (Right) Boxplot displaying induction of ITS sites in Scr RNAi and in triple NR2C1, NR2C2, and NR2F2 knockdown VA13 cells,
treated or not with a DSB-inducing drug. Top and bottom boxes show the first and third quartiles around the mean. p values are from a two-sided Student's t test.

(E) Boxplot quantification of the fusion events. Top and bottom boxes show the first and third quartiles around the mean, p values are from a two-sided Student's 
t test. 

(F) (Upper-left) SKY-FISH combined with telomere (green) and centromere (red) FISH showing interstitial telomeric signals at the translocation points between
chromosomes 1, 7, 9, 15, and 18 in the ALT(+) VA13 cell line. (Upper-right) Graphical representation of the rearranged chromosome. (Bottom) Distribution of
translocation events with or without ITS sites upon γ irradiation. ter-ter, telomere-telomere translocations; ter-Cen, telomere-centromere translocations; gen-gen,
telomere-genome translocations.

See also Figures S5 and S6 and Table S1.
Figure 7. Comparison of TTI with BFB

(A) Outline of the BFB cycles. Instability is stopped by the acquisition of telomerase. (B) Outline of the TTI. Instability is further enhanced by ALT. Insertions of 
telomeric DNA lead to the creation of potential fragile sites at endogenous NR2C/F binding regions. Two possible outcomes are highlighted upon breakage of 
these sites, intra- or inter-chromosomal rearrangements. 
Titia de Lange. All were cultured in DMEM Glutamax (Life Technologies)
supplemented with 10% FBS (Eurobio). U2-OS and HeLa cell lines, harboring 
single genomic insertions of the LacO array, were cultured with hygromycin B. 

Cell Immunofluorescence 

Staining was performed as described before (Dejardin and Kingston, 2009),
using the following antibodies: anti-TRF2 (Abcam [13579]); anti-PML (Santa
Cruz Biotechnology [sc-5621/sc-966]); anti-NR2C2 (PPMX [PP-H0107B-00]);
anti-HMBOX1 (Abcam [ab97643]); anti-NR2F2 (Abcam [ab50487]); anti-Flag
(Sigma [F7425]). Secondary antibodies: Jackson ImmunoResearch (anti-rabbit
DyLight 488 [711-485-152]; anti-mouse DyLight 488 [715-485-150];
anti-mouse DyLight 549 [715-505-150]; anti-rabbit DyLight 549 [711-505-152];
anti-mouse DyLight 649 [715-495-150]; anti-rabbit DyLight 549
[711-495-152]).

IF/FISH and FISH 

After incubation with secondary antibodies, cells were cross-linked at 37°C in
3.6% formaldehyde for 20 min and then incubated at 75°C in 2× SSC for 1 hr
and in 0.1 M NaOH for 10 min and rinsed with 2× SSC. Then, cells were
dehydrated by successive 75% and 100% ethanol baths and air dried. Slides were
incubated at 82°C with a telomeric-C PNA probe coupled with FAM or Cy3
(Panagene [F1001/F1002]) for 2 min and then at 37°C for 12 hr. The probe
was diluted in hybridization buffer (10% dextran sulfate, 50% deionized
formamide, 2× SSC) to the final concentration of 50 mM. Then, cells were washed
with 2× SSC at 42°C for 20 min. Finally, slides were mounted in ProlongGold
(Life Technologies).

Structured Illumination-Based Super-Resolution Microscopy and
Telomere Clustering

Telomeres were labeled with a PNA probe using the FISH protocol described
above. Full z stacks were acquired for each nucleus using the 100× objective.
Telomeric signals were counted on z projections of nuclei. If more than one
telomeric signal could be distinguished within a telomeric focus, then such focus
was scored as a telomeric cluster.
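
The scoring rule above amounts to a simple threshold on the number of signals resolved per focus. The sketch below is a hypothetical illustration of that rule, assuming the per-focus signal counts have already been extracted from the z projections; `score_foci` and the input list are not part of the published analysis.

```python
from collections import Counter


def score_foci(signals_per_focus):
    """Label each focus 'cluster' if more than one telomeric signal is
    resolved within it, otherwise 'single', and tally the labels."""
    return Counter(
        "cluster" if n > 1 else "single" for n in signals_per_focus
    )


# Hypothetical counts for five foci in one nucleus.
tally = score_foci([1, 1, 3, 2, 1])
```

Here `tally["cluster"]` is 2 and `tally["single"]` is 3; the fraction of clustered foci per nucleus can then be compared between conditions.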

Targeting LacO to Telomeres 

Cells were transfected either with GFP-LacI or with GFP-LacI and
Flag-NR2C2-LacI, and the array was visualized either with the GFP or by
LacO-specific FISH. Transfections were performed using the AMAXA nucleofector
device using nucleofector reagent from Mirus according to the manufacturers'
instructions. Quantification of LacO localization to telomeres was performed
48 hr after transfection.

Transfections and Constructs 

We performed triple NR2C/F knockdown because neither single nor double 
knockdowns had a strong impact on ALT telomeres, probably because of 
redundant function (Dejardin and Kingston, 2009). The triple knockdown was 
performed with Stealth RNAi (Life Technologies) using Lipofectamine 
RNAiMAX (Life Technologies). The oligonucleotides were transfected twice 
within 48 hr. RNAi oligonucleotides used in the experiment were: 
siRNA negative control Med GC, NR2F2 [NR2F2MSS235955], NR2C1
[NR2C1HSS110947], and NR2C2 [NR2C2HSS110950]. Flag-NR2C2 and
Flag-NR2C2-DN were kindly provided by Osamu Tanabe. Constructs were
transfected using the AMAXA Cell Line Nucleofector Kit R (VA13 cells) or Kit
V (Saos-2 cells) according to the manufacturer's instructions.

ITS Induction 

Asynchronously growing cells were incubated with bleomycin (Calbiochem, 
30 mU/ml) or etoposide (Sigma, 10 µM). After 2 hr, the drug-containing
medium was replaced with fresh medium, and colcemid was added 43 hr after
drug release to prepare chromosome spreads. We quantified the interstitial
telomeric FISH signal on single and (rare) fused chromosomes. In triple
NR2C/F knockdown, because of the high frequency of chromosomal fusions,
we quantified the interstitial telomeric FISH signal only from single (not fused) 
chromosomes to avoid counting fused chromosomes with residual telomere 
signal as ITS. 



Aphidicolin Treatment and TRF1 Knockdown 

Cells were treated with low doses (0.3 µM) of Aphidicolin for 24 hr. Aphidicolin
was added 24 hr after a second TRF1 RNAi transfection (two RNAi transfections
within 48 hr). After 20 hr of Aphidicolin treatment, colcemid was added
to medium to induce mitotic arrest. RNAi used for TRF1 knockdown was
purchased from Dharmacon (SmartPool RNAi).

Western Blotting 

Nuclear extracts from transfected cells were run on 12% Bis-Tris gels (Life
Technologies [NP0341]) and transferred to PVDF membranes using a liquid
transfer system for 2 hr at 300 mA. The membrane was blocked for 30 min
at room temperature in 1× PBS containing 5% milk and then incubated with
primary antibodies diluted 1/1,000 in the same buffer for 2 hr at room
temperature, washed twice for 15 min in PBS-0.05% Tween 20, and incubated for 1 hr
with secondary antibodies diluted 1/5,000 in 5% milk-PBS, which was followed
by two washes in PBS-0.05% Tween 20. Antibodies: αFlag (Sigma [F7425]),
αNR2C2 (PPMX [PP-H0107B-00]), αHMBOX1 (Abcam [ab97643]), αTRF2
(Abcam [ab13579]), αTRF1 (Abcam [ab10579]), αNR2F2 (Abcam [ab50487]),
αNR2C1 (Santa Cruz [sc-9087]), αPCNA (Santa Cruz [sc-25280]), anti-rabbit-HRP
(Sigma [A0545]), anti-mouse-HRP (Sigma [A4416]).

ChIP Sequencing

Cells growing in monolayer were cross-linked in 1% formaldehyde/PBS for
10 min, washed twice in PBS, and scraped in PBS/0.05% Tween. Then cells
were pelleted, washed in PBS, and incubated at 4°C in lysis buffer 1 for
10 min, then at room temperature in lysis buffer 2 for 10 min, dounced with a
tight pestle, centrifuged, resuspended in lysis buffer 3 (1 ml/IP for ~2 ×
10^ cells) and sonicated (12 pulses of 70% power, 15 s ON, 45 s OFF using a
Misonix sonicator) to obtain chromatin fragments of 200 bp. Subsequently,
chromatin was pre-cleared at 4°C with 10 µl/ml BSA-blocked Dynabeads
(Life Technologies, a 1:1 mix of protein A and protein G beads) for 30 min
and incubated at 4°C with 5 µg antibody/IP overnight. Chromatin was then
incubated with magnetic beads at 4°C for 2 hr. Beads were washed five times
with RIPA and once in TE with 50 mM NaCl. Chromatin was eluted from the
beads by incubating in elution buffer with shaking for 30 min. Cross-linking
was removed by overnight incubation at 65°C. After RNaseA and proteinase
K treatments, the DNA was extracted with phenol:chloroform and ethanol
precipitation. Isolated DNA was resuspended in water. ChIP experiments were
performed four to six times independently for each antibody. Libraries were
cloned and sequenced by Fasteris SA (Switzerland) using the Illumina strategy
(HiSeq2000, single-end).

Antibodies 

anti-TRF2 (Santa Cruz Biotechnology, sc-9143); anti-NR2C2 (PPMX,
PP-H0107B-00); anti-HMBOX1 (Abcam, ab97643); anti-NR2F2 (Abcam, ab50487).

Buffers Composition 

Buffers used for ChIP have been described previously (Lee et al., 2006). 

Bioinformatic Analysis 

Bioinformatic analysis is described in the Extended Experimental Procedures. 

SUPPLEMENTAL INFORMATION 

Supplemental information includes Extended Experimental Procedures, 
seven figures, and three tables and can be found with this article online at 
http://dx.doi.org/10.1016/j.cell.2015.01.044.
SUMMARY

Telomerase is required for long-term telomere main- 
tenance and protection. Using single budding yeast 
mother cell analyses we found that, even early after 
telomerase inactivation (ETI), yeast mother cells 
show transient DNA damage response (DDR) epi- 
sodes, stochastically altered cell-cycle dynamics, 
and accelerated mother cell aging. The accelera- 
tion of ETI mother cell aging was not explained 
by increased reactive oxygen species (ROS), Sir pro- 
tein perturbation, or deprotected telomeres. ETI 
phenotypes occurred well before the population 
senescence caused late after telomerase inactiva- 
tion (LTI). They were morphologically distinct from 
LTI senescence, were genetically uncoupled from 
telomere length, and were rescued by elevating 
dNTP pools. Our combined genetic and single-cell 
analyses show that, well before critical telomere 
shortening, telomerase is continuously required to 
respond to transient DNA replication stress in mother 
cells and that a lack of telomerase accelerates other- 
wise normal aging. 

INTRODUCTION 

Telomeres, protective DNA-protein complexes at the ends of eu- 
karyotic chromosomes, buffer against the loss of sequence dur- 
ing DNA replication and distinguish normal chromosome ends 
from potentially dangerous double-strand breaks. Telomeres 
are composed of sequence-specific DNA binding proteins 
bound to highly repetitive DNA sequences and are increasingly 
recognized as genomic regions prone to replication stress (Miller 
et al., 2006; Sfeir et al., 2009; Drosopoulos et al., 2012). Without 
the telomeric DNA-elongating enzyme telomerase, progressive 
telomere shortening eventually causes the collapse of the pro- 
tective DNA-protein complex (deprotection), but this occurs 
only after many cell divisions, late after telomerase inactivation 
(LTI). In LTI cells, telomere deprotection shares many properties 
with classic DNA damage (Nautiyal et al., 2002; d’Adda di Fagagna
et al., 2003) and induces a DNA damage response (DDR) and
a permanent G2/M cell-cycle arrest (senescence).

Previously, responses to telomerase deletion have generally 
been reported only after a significant delay (in S. cerevisiae, after 
~50-80 divisions). Thus, it was thought that cells sense altered
telomere properties that signal senescence only when telomeres 
become critically short and deprotected. Hence, responses and 
phenotypes of cells early after telomerase inactivation (ETI) have 
not been extensively investigated. However, it was previously 
shown that, in ETI cells, very short telomeres appear at low fre- 
quencies that fuse to an induced double-strand break (DSB) 
(~10“^ to 10“^). These rare fusions became molecularly detect- 
able when telomerase was inactivated by either deletion of 
the telomerase RNA template TLC1 (tlc1Δ) or by replacing the
reverse transcriptase subunit, EST2, with the mutant est2- 
D530A, which assembles a telomerase ribonucleoprotein 
enzyme complex lacking telomeric DNA polymerization activity 
(Chan and Blackburn, 2003; Lingner et al., 1997). These fuso- 
genic telomeres arose in ETI cells well before any signs of bulk 
population senescence and even if the telomeres had been 
pre-lengthened. Therefore, even the short-term absence of telo- 
merase activity causes cells to experience a low but detectable 
genomic instability. 

In a process distinct from the permanent bulk population cell- 
cycle arrest resulting from critically short telomeres in senescent 
LTI cells, an individual wild-type (WT) yeast mother cell will cease 
divisions after it has produced ~25 daughter cells. As of yet, there
has been very little evidence suggesting interaction between 
the pathways that regulate these two kinds of aging, hereafter 
referred to as “LTI senescence” and “mother cell aging/lifespan,” 
respectively. Despite the identification of multiple genes that 
regulate mother cell lifespan (Bishop and Guarente, 2007;
Johnson et al., 1999; Kaeberlein, 2010), the mechanisms causing
mother cell aging of even WT yeast remain poorly understood. 

Here, we report experiments employing single cell methodol- 
ogies, supporting a model in which budding yeast mother cells 
lacking telomerase activity are less able to resolve replication 
stress inherent to telomeres. These cells show induction of a 
signaling pathway indicative of transient DNA replication stress, 
altered cell-cycle dynamics even in young mother cells, and 
accelerated aging (reduced lifespan), independently of telomere 
length. Our results demonstrate that this occurs well before the 
Figure 1. ETI Mother Cells Show a Non-Progressive Cell-Cycle Length Phenotype and Reduced Lifespan that Is Rescued by SML1 Deletion

(A-D) Mother cell budding profiles for (A) WT, (B) tlc1Δ, (C) est2-D530A, and (D) tlc1Δ sml1Δ, showing cell-cycle durations and heterogeneity (see exponential
color scale; cell cycles with durations of 1.4 hr or less were colored in purple). The x axis displays individual mother cells shown as vertical bars, with budding events
indicated as horizontal white divisions. Mean lifespan for each genotype is presented in the upper-left corner of the plot.

(E) ETI mother cells showed reduced replicative lifespans compared to WT cells (number of cells [n]: tlc1Δ, 354; est2-D530A, 117; WT, 234. p value for difference
between tlc1Δ and WT < 1e-37).

(F) Deletion of SML1 restores lifespan of ETI mother cells to WT levels (n: tlc1Δ sml1Δ, 77; sml1Δ, 39).

(G) The heterogeneity of cell-cycle lengths in ETI cells did not progressively worsen relative to WT as mother cells aged. Fold increase in cell-cycle variability from
first and second to third and fourth last cell cycles compared for each genotype (shown below each set). The variance of first and second cell cycles of tlc1Δ and
est2-D530A is significantly greater than that of WT (F test; p < 1e-16 and 1e-13, respectively). Error bars indicate SD.

See also Figures S1 and S2.



onset of LTI senescence and that the accelerated aging of ETI 
mother cells resembles the normal mother cell aging process. 

RESULTS 

Mother Cells Lacking Active Telomerase Show 
Increased Heterogeneity of Cell-Cycle Durations and 
Reduced Lifespans 

We analyzed the properties of individual haploid ETI mother 
cells, well before any signs of cellular LTI senescence, freshly 
isolated from sporulation of heterozygous telomerase-compe- 
tent diploids. Following genotyping, cells were taken from 
logarithmically growing cell cultures (~25-30 generations after 
telomerase loss), in which the overwhelming majority of cells 
were robustly growing newborn or very young mother cells. 
These cells were placed in a microfluidic device, and the budding 
cycles and lifespans of individual mother cells were continuously 
monitored for 2 days by repeated microscopic imaging (Xie et al., 
2012; Zhang et al., 2012). 

First, even the youngest ETI mother cells (tlc1Δ or est2-D530A) 
immediately showed higher frequencies of stochastically longer 
and more heterogeneous cell-cycle durations than WT (Figures 
1A-1C; note especially between times 0 to 5 hr, as marked on 
Y axes). As the durations of the last two budding cycles were 



highly heterogeneous in both WT and ETI mothers, they were 
discarded from all cell-cycle duration analyses discussed here. 
This cell-cycle heterogeneity was consistent with observations 
of bulk ETI population budding kinetics, as manifested by cells 
lingering in the large-budded state (G2/M), enriched for cells 
with short spindles and unsegregated chromosomes (Figures 
S1A-S1C). Second, we analyzed the mother cell aging of individual 
ETI cells and found that the lack of telomerase activity 
reduced ETI mother cell lifespan. Mean budding lifespan was 
12.6 generations for tlc1Δ (7 replicates) and 7.6 generations for 
est2-D530A (3 replicates), compared to 22.1 for WT mother cells 
(Figures 1A-1C and 1E). Furthermore, the catalytically inactive 
telomerase est2-D530A point mutant showed even longer cell-cycle 
durations than tlc1Δ ETI mother cells, and the lifespan 
reduction was even more severe. Hence, lack of telomerase 
enzymatic activity, rather than the absence of an assembled 
telomerase ribonucleoprotein complex, causes increased cell- 
cycle heterogeneity and faster mother cell aging. 
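
As a rough illustration of how replicative lifespan data of this kind can be summarized, the following Python sketch computes a mean lifespan and the fraction of mother cells still dividing at each generation. The bud counts are invented placeholders, not data from this study, and the function names are hypothetical.

```python
from collections import Counter

def mean_lifespan(lifespans):
    """Mean number of buds produced per mother cell."""
    return sum(lifespans) / len(lifespans)

def survival_curve(lifespans):
    """Fraction of mothers with a lifespan of at least g buds, for g = 0..max."""
    n = len(lifespans)
    deaths = Counter(lifespans)          # cells whose final bud count is g
    alive = n
    curve = []
    for g in range(max(lifespans) + 1):
        curve.append(alive / n)          # cells with lifespan >= g
        alive -= deaths.get(g, 0)
    return curve

# Hypothetical bud counts for a handful of mother cells (illustrative only).
toy_lifespans = [5, 8, 8, 12, 20]
toy_mean = mean_lifespan(toy_lifespans)
toy_curve = survival_curve(toy_lifespans)
```

Comparing such curves between genotypes is one simple way to visualize a lifespan reduction of the kind reported in Figure 1E.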

Heterogeneous Cell Cycles Do Not Progressively 
Worsen with Shortening Telomeres 

If the extended, heterogeneous cell-cycle lengths of ETI mother 
cells were due solely to telomere shortening, we would have 
expected the phenotype to worsen progressively with each 
successive cell division. However, this was not the case. First, as 
individual mother cells progressed from being very young to old, 
ETI mother cells did not show any significant progressive in- 
crease in mean duration or heterogeneity of mother cell-cycle 
lengths relative to WT (Figure 1G). Second, during the individual 
ETI mother cell lineages, a young mother cell whose initial cell cy- 
cle was long had no greater probability of having subsequent 
longer cell cycles or a shorter lifespan than one with an initial 
short cell cycle (Figure S2), supporting a stochastic and 
episodic, rather than progressive, nature of the occurrence of 
longer cell cycles. These highly stochastic episodes of cell-cycle 
heterogeneity and lack of any progressive worsening of this 
phenotype as ETI mother cells aged are not the predicted result 
of progressive telomere shortening. 
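
The comparison summarized in Figure 1G can be sketched as a ratio of late to early cell-cycle variance. The Python snippet below is a minimal illustration under stated assumptions: the durations are hypothetical, the function name is invented, and pooling cycles across mother cells before taking the variance is a simplification that may differ from the analysis actually performed.

```python
from statistics import variance

def variability_fold_change(lineages):
    """Ratio of late to early cell-cycle variance across mother cells.

    Each lineage is a list of cell-cycle durations (hr) in birth order.
    Early = first and second cycles; late = third- and fourth-from-last
    cycles, so the final two (highly variable) cycles are excluded.
    """
    early, late = [], []
    for durations in lineages:
        if len(durations) < 6:           # need non-overlapping early/late windows
            continue
        early.extend(durations[:2])
        late.extend(durations[-4:-2])
    return variance(late) / variance(early)

# Hypothetical cell-cycle durations in hours (illustrative only).
toy_lineages = [
    [1.5, 1.7, 1.6, 1.5, 1.8, 1.6, 3.0, 5.0],
    [1.4, 2.6, 1.5, 1.6, 1.5, 2.5, 4.0, 6.0],
    [1.5, 1.6, 4.0],                     # too short, skipped
]
toy_fold = variability_fold_change(toy_lineages)
```

A fold change that is similar between WT and ETI genotypes would correspond to the non-progressive behavior described above.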




Figure 2. SML1 Deletion Rescues Mother Cell Lifespan of ETI Cells 
Independently of Telomere Length 

(A) SML1 deletion had no significant effect on the rate of bulk population 
senescence in ETI cells passaged on solid media to induce LTI-senescence. 

(B) Southern blot analysis of telomeric DNA restriction fragment lengths of cells 
taken from serial streaks shown in (A), using a TG(1-3) repeat telomeric probe. ETI 
(tlc1Δ) and ETI sml1Δ (tlc1Δ sml1Δ) displayed similar rates of telomere shortening, 
and the lower ends of the telomere length distributions were similar. 
See also Figure S3. 



SML1 Deletion Rescues Mother Cell Lifespan of ETI 
Cells Independently of Telomere Length 

Because we observed an extended G2/M phase in bulk popula- 
tion analyses (Figure S1), which is often the result of DDR activa- 
tion, we determined whether mutations affecting the DDR 
affected the above ETI phenotypes. Responses to various forms 
of DNA damage, including that sensed at critically short telo- 
meres in LTI senescence, involve a cascade of phosphorylation 
events, with early upstream steps occurring at the source of DNA 
damage through PIKK family member kinases Mec1 (ATR) and/or 
Tel1 (ATM). Strains lacking only Mec1 are inviable, but this 
mec1Δ lethality can be rescued by deletion of SML1 (Zhao 
et al., 1998). Sml1 inhibits ribonucleotide reductase (RNR), which 
catalyzes the rate-limiting step in dNTP production (Reichard, 
1988). Deletion of SML1 increases RNR activity and elevates 
dNTP pools, obviating the need for certain DDR components under 
healthy growing conditions, and can be protective against 
some forms of DNA damage (Andreson et al., 2010; Jossen 
and Bermejo, 2013). Strikingly, deletion of SML1 in ETI tlc1Δ 
strains efficiently rescued the ETI-induced heterogeneity of 
budding cycle durations (Figure 1D) as well as the shortening 
of mother cell lifespan (Figure 1F). However, SML1 deletion alone 
produced no change in the rates of bulk telomere shortening in 
ETI cells, nor in the subsequent onset of LTI senescence (Figures 
2 and S3). We also confirmed that the deletion of SML1 alone 
caused no significant effect on mother cell lifespans and telomere 
length compared to WT (Figures 1F and S4B). Hence, the 
dramatic rescue of ETI cell-cycle heterogeneity and accelerated 
mother cell aging by SML1 deletion cannot be explained 
by increased telomere length or by slower rates of telomere 
shortening. 

ETI Mother Cells Age with Terminal Cellular and 
Mitochondrial Morphologies Distinct from LTI 
Senescence but Similar to Those of Normal Mother Cell 
Aging 

We tested further whether budding cessation due to mother cell 
aging in ETI or WT cells was distinguishable from the G2/M arrest 
caused by LTI senescence by examining cell and mitochondrial 
morphology at the end of the lifespans (terminal morphology). 
Typical WT mother cell aging produces terminal cells that are 
mostly small budded with minimal or no mitochondrial fluores- 
cence signal from a mitochondrially localized GFP (mtGFP) 
Figure 3. ETI Mother Cells Age with Terminal Cellular and Mitochondrial Morphologies Distinct from LTI Senescence but Similar to Those of 
Normal Mother Cell Aging 

(A) Three possible terminal death morphologies were observed in WT, ETI, and/or LTI (see text for definition) mother cells: small budded (type i), elongated (type ii), 
and a G2/M large-budded (“dumbbell” shape) (type iii). Mitochondrial volume was measured using mitochondrially localized GFP (mtGFP). 

(B) ETI and LTI populations of tlc1Δ sml1Δ cells were prepared to distinguish cell death from normal mother cell aging and LTI senescence. tlc1Δ sml1Δ LTI (n = 49) 
strains senesced and showed reduced lifespan, as expected. 

(C) ETI cells terminally arrest in a manner similar to WT mother cells and distinct from LTI senescence. Most of the cells in ETI tlc1Δ and ETI tlc1Δ sml1Δ show type i 
or type ii death morphologies (> 95%), similar to terminal WT mother cells. In LTI cells, a major fraction (~70%) displayed type iii morphology (p < 1e-4 compared 
to ETI tlc1Δ sml1Δ by Fisher’s exact test), indicative of senescence induced by critically short telomeres. 



(Figure 3Ai) and a smaller population of elongated cells with 
brighter mitochondrial fluorescence (Figure 3Aii). In contrast, 
cells terminally arrested due to LTI senescence accumulate 
with a swollen, large-budded (“dumbbell”) morphology and 
with mitochondrial fluorescence that gradually forms very bright 
dots (Figure 3Aiii) (Nautiyal et al., 2002). We created and 
analyzed two populations of tlc1Δ sml1Δ cells. The first population 
was isolated as soon as possible after genotyping (ETI) 
and was enriched for mother cells that would reach their aging 
limit prior to LTI senescence. The second population was 
passaged for approximately ten additional generations prior to 
microfluidics analysis, thus enriching for cells that would un- 
dergo LTI senescence (critically short telomeres) before the 
mother cells reached their aging limit (Figure 3B). Terminally 
aged ETI tlc1Δ and ETI tlc1Δ sml1Δ mother cells accumulated 
mostly in two dominant terminal morphologies, which resembled 
the two dominant WT terminal morphology phenotypes (Xie 
et al., 2012) and only very rarely in the dumbbell morphology (Figures 
3A and 3C). In contrast, in terminal LTI tlc1Δ sml1Δ mother 
cells, terminal dumbbell morphologies became the major type 
observed, indicating that a large proportion of the population 
had entered LTI senescence (Figures 3Aiii and 3C). These results 
support terminal cellular and mitochondrial morphology as an 
accurate distinction between LTI senescence and normal 
mother cell aging and provide further evidence that ETI mother 
cells cease divisions as a result of mother cell aging rather 
than LTI senescence. 

Mutation of Specific DDR Components Exacerbates ETI 
Cell-Cycle and Lifespan Phenotypes 

We investigated other proteins previously implicated in yeast 
telomere maintenance and in the DDR for effects on mother 
cell aging. Maintenance of yeast telomeres at normal length 
requires DDR kinases Mec1 and Tel1 (Sabourin and Zakian, 
2008; Takata et al., 2004) and the replication stress-specific 
DDR adaptor protein Mrc1 (Grandin et al., 2005). First, we found 
that cell-cycle heterogeneity and mother cell lifespan were 
similar in WT, tel1Δ, and mec1Δ sml1Δ strains (Figures 1A, 4A, 
4C, S4C, and S5A). Because tel1Δ in haploid cells reduces telomerase 
action on telomeres, telomeres decline to a short length 
that is then stably maintained (Greenwell et al., 1995; Lustig and 
Petes, 1986). The tel1Δ cells used here were isolated immediately 
after sporulation of heterozygous parent diploids and 
analyzed when telomeres were still shortening from near-WT 
lengths. Therefore, having telomeres that are shortening but 
eventually stably maintained is not alone sufficient to alter cell- 
cycle duration and lifespan. 

Next, we examined how mutations of Mec1 and Tel1 affect the 
ETI phenotypes. Because sml1Δ, as shown above, efficiently 
rescues the accelerated aging of ETI mother cells, it is difficult 
to determine whether Mec1 has a role in this process, due to 
the necessity of deleting SML1 for viability in mec1Δ strains. 
However, ETI tlc1Δ tel1Δ double-mutant mother cells had even 
greater cell-cycle heterogeneity and shorter budding lifespan 
(mean 9.8 generations, 2 replicates) than control ETI tlc1Δ single-mutant 
mother cells (Figures 1B, 4B, and 4D). As shown previously, 
freshly isolated ETI haploid cells that are also mutated for 
Tel1 or Mec1 (tlc1Δ mec1Δ sml1Δ or tlc1Δ tel1Δ) have a rate of 
initial telomere shortening and progression to LTI population 
senescence similar to tlc1Δ single mutants (Chan and Blackburn, 
2003) (Figures 4E and 4F). Hence, the exacerbation of 
the ETI cell-cycle heterogeneity and lifespan reduction phenotypes 
caused by lack of Tel1 is not explained by faster telomere 
shortening or accelerated population senescence. 

Because sml1Δ rescues the cell-cycle and lifespan phenotypes 
of tlc1Δ mother cells and is known to facilitate DNA replication 
by increasing nucleotide levels (Chabes et al., 2003), we 
suspected that ETI cells may be more vulnerable to telomeric 
Figure 4. TEL1 Deletion Exacerbates ETI Cell Cycle and Lifespan Phenotypes, but Not Senescence or Telomere-Shortening Rates 

(A and B) Mother cell budding profiles for tel1Δ (A) and tlc1Δ tel1Δ (B). 

(C) tel1Δ (n = 38) strain lifespan does not differ from WT. 

(D) tlc1Δ tel1Δ (n = 83) mutation worsens the lifespan reduction caused by ETI mutations in mother cells (p < 1e-4, compared with tlc1Δ alone). 

(E) ETI and ETI tel1Δ mutants displayed similar rates of senescence when passaged on solid media. 

(F) Southern blot analysis of telomeric DNA restriction fragment lengths of cells taken from plates after serial streaks shown in (E). 

See also Figures S4 and S5. 



DNA replication stress. Therefore, we mutated the DDR adaptor 
protein Mrc1, which is required specifically for the DNA replication 
stress checkpoint (Alcasabas et al., 2001; Osborn and Elledge, 
2003) and has a minor role in telomere length maintenance 
(Tsolou and Lydall, 2007). Mutation of 17 potential PIKK family kinase 
consensus phosphorylation sites on Mrc1 (mrc1AQ) allows 
full cell viability but disables the DNA replication stress response 
(Osborn and Elledge, 2003). Despite the lack of any mother cell 
lifespan or cell-cycle effect of mrc1AQ alone (Figures 5A and 
5C), tlc1Δ mrc1AQ double-mutant ETI mother cells showed 
even greater cell-cycle length heterogeneity than the tlc1Δ single-mutant 
ETI cells (Figure 5B). Consistent results were also 
seen in the G2/M durations in bulk populations (Figure S1D), 
and mean lifespan was markedly reduced to 8.8 generations (2 
replicates), compared with 12.6 generations for the control 
tlc1Δ ETI strains (Figures 1B and 5D). These effects were not 
explainable by reduced telomere length or accelerated senescence, 
as the mrc1AQ mutant allele produced stable telomeres 
only slightly shorter than WT and had no effect on the kinetics 
of telomere shortening or bulk population senescence (Figures 
5E and 5F). We also tested the epistasis relationship of tel1Δ 
and mrc1AQ in the ETI context. ETI triple-mutant tlc1Δ tel1Δ 
mrc1AQ cells showed the same lifespan shortening as the double 
ETI mutants (Figure S5B). We conclude that Tel1 and Mrc1 
checkpoint functions act in the same pathway and that lack of 
either one acts synthetically with the ETI mother cell phenotypes. 

In the DDR cascade, downstream of Tel1 or Mec1, the DDR 
adaptor protein Rad9 can act semi-redundantly with the adaptor 
protein Mrc1. Mrc1 is specifically involved in the replication 
stress response while Rad9 is mostly important for responding 
to DNA breaks and other DNA damage. In contrast to 
tlc1Δ mrc1AQ ETI cells, tlc1Δ rad9Δ ETI mother cell-cycle durations 
and lifespans were not significantly different from tlc1Δ 
ETI cells, consistent with bulk population analyses (Figure S1 
and data not shown). The ETI tlc1Δ rad9Δ mother cells had a 
mean lifespan of 16.5 generations (2 replicates), while the control 
tlc1Δ strain had a mean lifespan of 13.7 generations (Figure S5C). 
Thus, rad9Δ did not significantly affect the accelerated aging 
phenotypes of ETI mother cells. These results confirmed the 
specificity of the Mrc1 checkpoint function in the ETI mother 
cell phenotypes and indicate the involvement of a DNA replication 
stress response, rather than a response to other forms of 
DNA damage, which requires Rad9. In summary, disrupting the 
DDR via tel1Δ or mrc1AQ mutations, but not by mec1Δ sml1Δ 
Figure 5. MRC1 Mutation Exacerbates ETI Cell Cycle and Lifespan Phenotypes, but Not Senescence or Telomere-Shortening Rates 

(A and B) Mother cell budding profiles for mrc1AQ (A) and tlc1Δ mrc1AQ (B). 

(C) mrc1AQ (n = 40) strain lifespan does not differ from WT. 

(D) tlc1Δ mrc1AQ (n = 90) mutation worsens the lifespan reduction caused by ETI mutations in mother cells (p < 2e-7, compared with tlc1Δ). 

(E) ETI and ETI mrc1AQ mutants displayed similar rates of senescence when passaged on solid media. 

(F) Southern blot analysis of telomeric DNA restriction fragment lengths of cells taken from plates after serial streaks shown in (E). 

See also Figures S4 and S5. 



or rad9Δ, strongly exacerbated the cell-cycle abnormalities and 
acceleration of mother cell aging in ETI cells, independently of 
telomere length and without accelerating LTI senescence. 

ETI Mother Cell Phenotypes Are Not Caused by 
Deprotected Telomeres 

Previous results showed that short, fusogenic telomeres occur 
spontaneously at very low frequencies in ETI cells (Chan and 
Blackburn, 2003). These fusogenic telomeres derive from rare in- 
dividual deprotected telomeres and can be detected by PCR as- 
says upon their fusion to an induced DNA double-stranded 
break. We tested whether the amount of such fusogenic telo- 
meres correlated with the severity of our ETI mother cell pheno- 
types using the same system (Chan and Blackburn, 2003) for 
semiquantitative PCR analyses. In agreement with the published 
work, we found that single-mutant ETI (tlc1Δ) and tel1Δ strains 
each showed detectable but low amounts of fusions resulting 
from a deprotected telomere fusing to an induced DSB and 
that tlc1Δ tel1Δ strains showed a synergistic increase (Figures 
6A and 6B). However, in tlc1Δ mrc1AQ ETI cells (Figure 6A), the 
mrc1AQ mutation produced no further significant increase over 
a tlc1Δ single mutant. Furthermore, sml1Δ did not reduce (and 
possibly increased) the number of fusogenic telomeres detected 
(Figures 6A and 6B). This complete non-concordance in these 
various ETI genotypes with the phenotypes we have observed 
here in ETI mother cells argues strongly against deprotected 
telomeres as a cause for the exacerbated cell-cycle heterogene- 
ity and accelerated mother cell aging. 

Further evidence that ETI phenotypes are not caused by 
deprotected telomeres, which induce a robust DDR (Nautiyal 
et al., 2002; d’Adda di Fagagna et al., 2003), came from 
comparing the genetic dependencies of ETI cell phenotypes 
versus DNA damage sensitivity. As previously reported, 
mec1Δ sml1Δ and rad9Δ mutations made cells highly sensitive 
to treatment with various classic DNA damaging agents 
(HU, UV, phleomycin, or MMS) (Figure S4D). This is in dramatic 
contrast to the experiments described above, in which 
mec1Δ sml1Δ and rad9Δ did not exacerbate the ETI phenotypes. 
Figure 6. Genotype Dependence of Telomere 
Fusions and Transient DNA Damage 
Response Episodes in Mother Cells 

(A) Semiquantitative PCR of DNA species resulting 
from the fusion of a deprotected telomere with an 
induced double-strand break in genetic backgrounds 
containing ETI and mrc1AQ mutant combinations. 

(B) Same as in (A) but with genetic backgrounds 
containing ETI and tel1Δ mutations. 

(C) Two representative profiles of RNR3-GFP 
peaks occurring in still-dividing individual mother 
cells. Cell divisions (green diamonds) and RNR3- 
GFP reporter levels (blue circles) were plotted 
throughout an individual mother cell's lifespan. 
Spline fitting is shown as red lines. 

(D) Frequencies of RNR3-GFP induction peaks in 
still-dividing cells such as those shown in (C). *p < 
0.01 and **p < 0.001 by Fisher’s exact test are 
indicated. 

(E) Two representative mother cell profiles shown, 
as in (C), with cells displaying terminal RNR3-GFP 
induction peaks. 

(F) Frequencies of RNR3-GFP induction peaks in 
terminal mother cells, such as those shown in (E). 
See also Figure S6. 




Hence, the genotype dependencies of ETI mother cell pheno- 
types are quite distinct from the dependencies of responses to 
classic DNA-damaging agents. 

Altered Recombination Levels Are Not Responsible for 
ETI Mother Cell Phenotypes 

Recombination is another process that has been implicated in 
maintaining yeast telomeres and occurs when telomeres lose 
protection, such as in LTI cells (McEachern and Blackburn, 
1996; Basenko et al., 2011). Following the onset of LTI senescence, 
Rad52-dependent recombination at telomeres allows a 
small fraction (~10^-4) of senescing LTI yeast cells to survive 
and continue dividing (Lundblad and Blackburn, 1993). Also, 
DNA replication stress can be relieved by mechanisms involving 
recombination. We therefore asked whether recombination 
plays any role in the ETI accelerated mother cell-cycle kinetics 
and aging response. Deletion of RAD52 alone causes no 
changes in telomere length maintenance, and telomeres in 
tlc1Δ rad52Δ strains shorten no faster than with tlc1Δ alone 
(Lundblad and Blackburn, 1993). However, rad52Δ alone caused increased 
mother cell-cycle duration heterogeneity (data not shown) and an acceleration of 
mother cell budding aging (Park et al., 1999). Notably, these rad52Δ phenotypes 
were not substantially rescued by SML1 deletion (mean lifespan, rad52Δ: 9.4, 
n = 130 versus rad52Δ sml1Δ: 13.2, n = 70) (Figure S6A and data not shown). 
Furthermore, the mean lifespan of ETI tlc1Δ rad52Δ sml1Δ mother cells was 
even lower than that of rad52Δ sml1Δ: 8.2 versus 
13.2 (Figure S6A). Hence, lack of Rad52 function appears to 
act additively to the effect of TLC1 deletion. This epistasis rela- 
tionship indicates that absence of telomerase activity and of 
Rad52 each causes acceleration of mother cell aging but by 
two distinct mechanisms. 

ETI Phenotypes Are Not Caused By Relocalization of Sir 
Proteins 

Another pathway previously implicated in yeast mother cell 
aging involves changes in Sir protein concentration and localiza- 
tion. For example, Sir2 overexpression has been shown to in- 
crease mother cell lifespan (Kaeberlein et al., 1999). However, 
several lines of evidence argue that Sir2 sequestration in ETI 
cells does not explain their accelerated aging. First, all of our 
ETI strains mated normally, implying that the mating type loci 
were still silenced and arguing against a large relocalization of 
Sir proteins. Second, localized puncta of Sir3-GFP, indicative 
of telomere-bound Sir complex proteins (Martin et al., 1999), 
were not significantly different between ETI and WT mother cells 
(Figure S6B). Third, although a single induced unrepairable DNA 
break has been reported to cause Rad9-dependent delocaliza- 
tion of Sir2 from telomeres (Martin et al., 1999; Mills et al., 
1999), as described above, rad9Δ neither exacerbated nor 
significantly rescued the accelerated aging in ETI mother cells. 
Together, these findings indicate that altered sequestration of 
the Sir complex is not the mechanism causing the accelerated 
aging of ETI cells. 

Lifespan Reduction of ETI Mother Cells Is Not Caused by 
Increased Reactive Oxygen Species 

We re-examined the previously described transcriptional profile 
data sets (Nautiyal et al., 2002; Table S2, passage 1) of ETI 
tlc1Δ cells. In an unbiased approach, we compared the large 
available number of yeast gene expression profiles, measured under 
different environmental conditions and genetic backgrounds 
(Edgar et al., 2002), to that of ETI cells (Tables S3, S4, S5). The 
top hit (Pearson correlation = 0.495, p < 1e-254) was treatment 
with diamide, a thiol-oxidizing agent that causes oxidative stress. 
Because intracellular ROS have long been theorized to play a role 
in aging and ROS levels can be elevated as a result of DNA dam- 
age (Rowe et al., 2008; Salmon et al., 2004), we tested whether 
oxidative stress caused the ETI mother cell phenotypes. 
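
To make the profile comparison concrete, the following Python sketch ranks a small library of expression profiles by their Pearson correlation with a query profile. It is only an illustration: the expression values, the condition names, and the `best_match` helper are hypothetical, and the sketch does not reproduce the data sets or the pipeline used in this study.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have equal length >= 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def best_match(query, library):
    """Return (name, r) of the library profile most correlated with the query."""
    scored = ((name, pearson(query, prof)) for name, prof in library.items())
    return max(scored, key=lambda item: item[1])

# Hypothetical log-ratio expression values for five genes (not real data).
query_profile = [1.2, -0.4, 0.9, 0.1, -1.0]
toy_library = {
    "condition_A": [1.0, -0.5, 1.1, 0.0, -0.8],
    "condition_B": [-0.9, 0.6, -1.0, 0.2, 1.1],
}
top_name, top_r = best_match(query_profile, toy_library)
```

With thousands of genes per profile, even a moderate correlation such as the 0.495 reported above can be highly significant.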

We assessed oxidative stress in ETI cells by quantifying ROS 
levels in our strains. If the accelerated aging of ETI cells is caused 
by higher intracellular ROS levels, it would be predicted that ETI 
tlc1Δ mother cells would have higher ROS than WT and that 
ETI tlc1Δ sml1Δ cells would have lower ROS levels than ETI single 
tlc1Δ mutants. However, ETI cells did not have significantly 
higher levels of ROS than WT (Figure S6C). Furthermore, sml1Δ 
and tlc1Δ sml1Δ strains showed even higher levels of ROS, the 
opposite effect from that predicted if ETI causes faster aging of 
cells via higher intracellular ROS level. We also tested the effects 
of anti-oxidants by treatment with N-acetyl-L-cysteine (NAC). 
However, NAC equally and only modestly lengthened mother 
cell lifespans of both ETI and WT mother cells (Figure S6D). 
Hence, we conclude that, even though the transcriptional profile 
changes in ETI cells include features of an oxidative stress 
response, increased ROS levels and oxidative stress are not a pri- 
mary cause of accelerated mother cell aging elicited by ETI. 

ETI Cells Show Transient RNR3 Upregulation during 
Mother Cell Divisions 

Given the connections between DNA damage and cell-cycle 
regulation, we turned to a more detailed analysis to establish 
whether DDR signaling was induced in ETI mother cells. Notably, 
phosphorylation of DDR components such as Rad53 and Mrc1 is 
only detectable in LTI cells and not in ETI cells (Grandin et al., 
2005) (Figure S4A). Therefore, we employed a more sensitive sin- 
gle-cell monitoring method to detect evidence for DDR signaling. 
We examined DDR activation during mother cell aging using a 
GFP-tagged allele of RNR3, a gene that is strongly induced as 
a downstream component of the DDR. We monitored GFP inten- 
sity during mother cell lifespan assays and quantified RNR3-GFP 
peaks (at least 1.3-fold above background) using strains containing 
RNR3-GFP and all relevant combinations of tlc1Δ, 
sml1Δ, mrc1AQ, and tel1Δ mutations. Peaks were classified as 
occurring before the last two cell divisions or during/after the 



last two divisions, referred to here as “still-dividing” and “termi- 
nal” peaks, respectively (Figures 6C and 6E). Terminal peaks 
were scored as peaks per mother cell, and still-dividing peaks 
were scored as peaks per cell cycle as mother cells underwent 
a different average number of cycles depending on genotype 
(Figures 6D and 6F). In WT mother cells (two replicates), a tran- 
sient RNR3-GFP peak appeared at a low frequency during the 
cell cycles of still-dividing mother cells (0.0062, 95% confidence 
limits 0.0031-0.0111; n = 11 events in 1,766 cell cycles) and 
terminal peaks occurred in 25.0% of mother cells (24/96 cells, 
95% confidence limits 0.1736-0.3456). In contrast, in tlc1Δ 
ETI mother cell lineages (three replicates), RNR3-GFP peaks 
occurred at significantly greater frequency in still-dividing 
mother cells (0.0177, 95% confidence limits 0.0123-0.0252; 
n = 30 events out of 1,696 cell cycles, p < 0.0025 compared 
with WT), and in 27.7% of terminal mother cells (44/159 cells, 
95% confidence limits 0.213-0.351). Hence, in still-dividing 
mother cells, ETI elicits an increased number of transient epi- 
sodes of DDR signaling. 
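
The peak-scoring rule described above (a peak at least 1.3-fold above background, labeled by whether it falls before or during/after the last two divisions) can be sketched in Python as follows. This is a simplified illustration, not the analysis code used here: it scans raw samples for local maxima instead of fitting splines, takes the background level as a given parameter, and assumes the cutoff is the time of the second-to-last division. The trace values and the `classify_peaks` name are hypothetical.

```python
def classify_peaks(times, signal, divisions, background, fold=1.3):
    """Label reporter peaks as 'still-dividing' or 'terminal'.

    A peak is a local maximum at least `fold` times the background level.
    Peaks before the second-to-last division are 'still-dividing'; peaks at
    or after it are 'terminal' (this cutoff is an assumption of the sketch).
    """
    if len(divisions) < 2:
        raise ValueError("need at least two division times")
    cutoff = sorted(divisions)[-2]
    peaks = []
    for i in range(1, len(signal) - 1):
        is_local_max = signal[i - 1] < signal[i] >= signal[i + 1]
        if is_local_max and signal[i] >= fold * background:
            label = "still-dividing" if times[i] < cutoff else "terminal"
            peaks.append((times[i], label))
    return peaks

# Hypothetical trace: hours, arbitrary fluorescence units, division times.
toy_times = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
toy_signal = [100, 105, 140, 110, 100, 102, 101, 150, 160, 120]
toy_divisions = [1.5, 3.0, 4.5, 6.5, 8.5]
toy_peaks = classify_peaks(toy_times, toy_signal, toy_divisions, background=100)
```

Counting still-dividing peaks per cell cycle and terminal peaks per mother cell then gives frequencies comparable in form to those reported in Figures 6D and 6F.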

Notably, relative to tlc1Δ single mutants, RNR3-GFP peaks in 
tlc1Δ sml1Δ ETI mother cells (three replicates) were significantly 
diminished in frequency in the still-dividing mother cells (0.0104, 
95% confidence limits 0.0071-0.0152; n = 27 events in 2,587 cell 
cycles, p < 0.05 compared with tlc1Δ), and terminal peaks 
occurred in only 13.1% of mother cells (22/168 cells, 95% confidence 
limits 0.087-0.191, p < 0.01 compared with tlc1Δ). This 
result is explainable, as sml1Δ raises nucleotide pools, making 
replication fork stalling less likely to occur (Andreson et al., 
2010; Jossen and Bermejo, 2013) and hence reducing the pos- 
sibility of eliciting a DNA replication stress response. 
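
For readers who wish to check frequency comparisons of this kind, the Python sketch below computes a two-sided Fisher's exact test (the test named in the Figure 6 legend) and an approximate 95% interval for a proportion, using the still-dividing peak counts quoted in the text. The Wilson score interval is an assumption made for illustration; the text does not state how the confidence limits were derived, so the limits it gives need not be reproduced exactly.

```python
from math import comb, sqrt

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in row 1 with margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(col1, row1)
    # Sum all tables that are no more probable than the observed one.
    p = sum(q for q in (prob(x) for x in range(lo, hi + 1))
            if q <= p_obs * (1 + 1e-9))
    return min(1.0, p)

def wilson_interval(k, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion k/n."""
    p = k / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half

# Peak counts and cell-cycle totals quoted in the text (still-dividing cells).
wt_peaks, wt_cycles = 11, 1766
tlc1_peaks, tlc1_cycles = 30, 1696

p_tlc1_vs_wt = fisher_exact_two_sided(
    tlc1_peaks, tlc1_cycles - tlc1_peaks, wt_peaks, wt_cycles - wt_peaks)
wt_low, wt_high = wilson_interval(wt_peaks, wt_cycles)
```

The same two functions can be applied to the terminal-peak counts (peaks per mother cell) by substituting the corresponding numerators and denominators.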

ETI tlc1Δ tel1Δ mother cells (two replicates) showed fewer 
RNR3 peaks than WT (or tlc1Δ ETI) in still-dividing cells (0.006, 
95% confidence limits 0.0034-0.0124; n = 10 events in 1,505 
cell cycles, p < 0.006 compared with tlc1Δ), and peaks occurred 
in 12.0% of terminal mother cells (15/125 cells, 95% confidence 
limits 0.0730-0.1897, p < 0.009 compared with tlc1Δ). This 
finding indicates that abrogating Tel1 greatly exacerbated the 
ETI mother cell aging phenotypes (Figures 4B and 4D) while 
reducing RNR3 induction events. We propose that the optimal 
response to the replication stress in tlc1Δ ETI cells requires 
Tel1 checkpoint function to activate DDR signaling, monitored 
here as downstream RNR3 induction. 

Interestingly, the tlc1Δ mrc1AQ ETI mother cells showed significantly 
more RNR3 peaks than WT cells in both still-dividing 
(0.0193, 95% confidence limits 0.0127-0.0289; n = 23 events 
in 1,192 cell cycles) and terminal mothers (32.4%, 35/108 cells, 
95% confidence limits 0.243-0.417) and also trended toward more 
RNR3 peaks than tlc1Δ (three replicates). Because the mrc1AQ 
mutation exacerbates the ETI cell-cycle duration and lifespan 
reduction phenotypes, this finding indicates that telomeric replication 
stress requires Mrc1 checkpoint function in order to elicit 
an appropriate response in the absence of telomerase, but not 
for induction of the RNR3 reporter. 

As the Rad9 adaptor protein is semi-redundant with Mrc1 in 
the DDR cascade, we investigated whether Rad9 is required 
for induction of RNR3 in the absence of Mrc1 checkpoint function. 
However, when combined with the tlc1Δ mutation, the 
mrc1AQ rad9Δ double mutation induced rapid lethality of the 
bulk ETI cell population, precluding mother cell analyses. 
Remarkably, this loss of viability in tlc1Δ mrc1AQ rad9Δ cells 
was also completely rescued by sml1Δ (data not shown). We 
conclude that, if ETI mother cells lack either Tel1 or Mrc1 checkpoint 
function, the response to, or repair of, telomeric DNA replication 
stress-induced damage is compromised. 

The Degree of Heterogeneity of Cell-Cycle Durations 
and Mitochondrial Changes in Young ETI Mother Cells 
Each Quantitatively Predict Lifespan 

Strikingly, for each mother cell genotype described above, the 
frequency and degree of lengthened cell cycles in young mother 
cells predicted the degree of reduction in mean lifespan (Figures 
S7A and S7B). Furthermore, for both WT and ETI mother cells, 
the extent of the mitochondrial fluorescence quantified at a given 
early time point in the budding lineage (4 hr) predicted the life- 
span of that particular mother cell (Figures S7C and S7D). The 
finding that these relationships held across multiple genotypes 
suggests that responses that occur in even the youngest mother 
cells are caused by the same problem that eventually regulates 
the lifespan of the cell. 

DISCUSSION 

Here, we have shown that lack of active telomerase affects yeast 
mother cells much earlier than expected, well before any effect 
on cells that can be attributed to critical telomere shortness. 
Notably, early telomerase inactivation in yeast mother cells 
caused increased heterogeneity of the cell cycle and accelerated 
aging. These phenotypes were rescued by increasing nucleotide 
pool levels and were sensitive to inactivation of specific DDR 
components. By several criteria, the ETI mother cell aging 
phenotype is consistent with an acceleration of normal mother 
cell aging processes and not senescence caused by loss of telo- 
mere protective function. These criteria included terminal cell 
and mitochondrial morphologies that were characteristic of ag- 
ing WT mother cells and distinct from those in senescent cells. 

Previously, it was thought that telomeres had to become 
critically short in order to elicit a cellular DDR. In contrast, our re- 
sults suggest that, independent of critical telomere shortness, 
ETI cells initiate signaling that accelerates an otherwise normal 
mother cell aging pathway. In the ETI setting, across multiple ge- 
notypes, the premature onset of mother cell aging is anticipated 
by the frequency and severity of stochastically slower cell- 
cycling events occurring in even young mother cells. In addition, 
we have shown that it is a lack of telomerase activity, rather than 
the lack of an assembled telomerase complex, that is the prox- 
imate cause of the response. Therefore, the action of telomerase 
on telomeres appears to be the most likely molecular property 
whose alteration causes these effects in ETI cells. 

Our combined findings provide evidence for the model shown 
in Figure 7. In this model, during mother cell divisions in ETI cells, 
lack of telomerase activity may eliminate a potential bypass 
mechanism for replication stress in the telomere. This causes 
transient episodes of a much milder DDR than the robust and 
sustained DDR elicited by critically short telomeres (Nautiyal 
et al., 2002; d’Adda di Fagagna et al., 2003). We propose that 
deletion of Sml1, via its known phenotype of increasing dNTP 



levels, alleviates this replication stress at an upstream level, pre- 
venting DNA damage signaling. This explains why Sml1 deletion 
suppressed both the transient DDR signaling and the acceler- 
ated aging in ETI mother cells. We propose that such replication 
stress arises often in telomeres and is sensed by Tel1, which is 
required to cause the observed transient rises in RNR3 levels 
in ETI cells. This response promotes either repair or tolerance 
of the replication stress that allows ETI cells to bypass it and 
enter the next cell cycle. In ETI cells lacking the checkpoint functions 
of Tel1 or Mrc1 (tlc1Δ tel1Δ or tlc1Δ mrc1AQ), cells cannot 
activate the appropriate response to this replication stress, 
thus exacerbating the ETI phenotypes. Without Mrc1 checkpoint 
function, compensation by the semi-redundant adaptor Rad9 
occurs to some degree, but the telomeric replication stress is 
not fully resolved and further damage can ensue. However, in 
the absence of telomerase, even a fully functional DDR is not suf- 
ficient to fully alleviate telomeric replication stress and prevent 
accelerated mother cell aging. 

We determined that neither ROS, recombination, deprotected 
(fusogenic) telomeres, redistribution of SIR protein complexes, 
nor a DDR similar to that in response to classic DNA damaging 
agents can account for the accelerated mother cell aging of 
ETI cells. Deletion of SML1 rescued the cell cycle, lifespan, 
and DDR (RNR3) induction phenotypes in dividing ETI mother 
cells and is known to suppress replication fork stalling (Andreson 
et al., 2010; Jossen and Bermejo, 2013). The majority of mutant 
phenotypes known to be suppressed by sml1Δ are related to 
DNA replication, including replication fork progression. This suggests 
that suppression by sml1Δ is very specific and occurs 
through elevated nucleotide pools via the release of inhibition 
of the RNR complex. Furthermore, the transcriptional profile of 
ETI tlc1Δ cells indicates that they upregulate RNR2, 3, and 4 
gene expression (Nautiyal et al., 2002) (Table S2). Taken together 
with previous findings, our results suggest that the higher nucleotide 
pools in sml1Δ cells prevent telomeric replication stress 
from occurring, thus suppressing the ETI phenotype by prevent- 
ing any need for DDR activation or telomerase intervention. 

Telomerase is predicted to be recruited to backtracked repli- 
cation forks resulting from stalling, which has been proposed 
to occur at measurable frequencies in telomeric DNA (Miller 
et al., 2006; Drosopoulos et al., 2012). Such backtracked 
forks will expose single-stranded leading-strand TG1-3 repeat 
sequence DNA, which is the substrate for telomerase elongation. 
Also, telomerase could aid in the repair of a broken telomeric fork 
generated when a stalled replisome collapses (Chang et al., 
2009). The resulting shortened telomere is a known preferred 
telomerase substrate (Miller et al., 2006). Other known inter- 
actions between DNA polymerase and telomerase actively 
engaged on telomeres may normally be required to optimize 
fork movement or fork restarting in telomeres. For example, 
the telomere-binding Cdc13-Stn1-Ten1 complex interacts via 
its Cdc13 subunit with a subunit of the telomerase complex 
(Est1) and also interacts (via Stn1) with a subunit of DNA polymerase 
alpha (Grossi et al., 2004). 

The causal mechanism underlying mother cell aging remains 
unknown even for WT yeast despite extensive identification of 
genetic and environmental modifiers of this process. Our find- 
ings indicate that telomerase functionality is required throughout 
Figure 7. Proposed Model 

(Upper-left) Proposed signaling interactions that regulate aging in response to telomeric DNA stress. Deleterious effects are shown in red. (Upper-right) In ETI cells 
lacking Sml1 (tlc1Δ sml1Δ), telomerase cannot alleviate replication stress. However, due to elevated dNTP levels, replication stress is prevented and aging is not 
accelerated. Eliminated or reduced signaling is shown in gray. (Lower-left) In ETI cells lacking Tel1 (tlc1Δ tel1Δ), as well as lacking telomerase rescue, the DDR 
is unavailable to alleviate replication stress, as indicated by the elimination of RNR3 signaling, and ETI-accelerated aging is exacerbated. (Lower-right) 
In ETI cells lacking Mrc1 function (tlc1Δ mrc1AQ), telomerase rescue is unavailable and the DDR to telomeric replication stress is partially hindered. Rad9 
is able to partially compensate and induce RNR3, but other downstream DDR targets cannot be induced, thus exacerbating the ETI-accelerated aging. 
See also Figure S7. 



the divisions of yeast mother cells in a more continuous mode 
than previously thought. The findings reported here indicate 
that telomerase activity is required to alleviate normal telomeric 
replication stress and allow mother cell aging to occur with 
wild-type kinetics. In addition, mutations known to inhibit telo- 
merase activity or telomere maintenance have been implicated 
in the premature onset of diseases of aging and reduced lifespan 
in humans and mice (Codd et al., 2013; Armanios and Blackburn, 
2012), and replication stress has been shown to induce aging in 
mouse cells (Flach et al., 2014). Therefore, this early requirement 
for active telomerase in preventing premature mother cell aging 
in yeast suggests a new possibility: that loss of telomerase 
activity may have telomere-length-independent consequences 
that accelerate aging and cause aging-related diseases in other 
eukaryotes. 

EXPERIMENTAL PROCEDURES 

See Extended Experimental Procedures for supplemental experiments: bulk 
population budding and chromosome segregation kinetics, NAG treatment, 
and ROS staining. 



Yeast Strain Construction 

All strains used in this study are listed in Table S1. Plasmid and oligo sequences 
are available upon request. Complete disruption of ORFs was carried 
out using PCR-mediated gene disruption (Rose et al., 1990). mrc1AQ mutant 
strains were made either via a loop-in, loop-out of a plasmid containing the 
mutant, followed by PCR verification, or via plasmid transformation of the 
mutant into an mrc1Δ strain. 

Growth of Mutants for Monitoring Early Loss of Telomerase 

ETI cells were produced by two methods: by sporulation of diploid heterozygote 
strains (tlc1Δ/TLC1, est2-D530A/EST2) or by loss of a covering 
plasmid in a haploid telomerase-deficient background strain struck on 
solid media. Colonies underwent 2 days of growth at 30°C, were genotyped, 
and were grown overnight (five to ten generations) in YPD prior to 
experimentation. 

Microfluidics Technique Analyses of Mother Cells 

Mother cells were monitored for 2 days by repeated microscopic imaging as 
described (Xie et al., 2012; Zhang et al., 2012). Microposts contained within 
the microfluidic device were used to clamp mother cells in place while 
daughter cells were washed away by hydrodynamically controlled flow of 
the surrounding liquid medium. Cell-cycle durations analyzed here excluded 
the first cycle observed and the terminal two divisions. 
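
The exclusion rule above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' analysis code: the function name and the example division times are hypothetical, and "the terminal two divisions" is interpreted here as the last two measured cycles.

```python
def analyzable_cycle_durations(division_times_min):
    """Return cell-cycle durations (min) for one mother cell, dropping the
    first observed cycle and the last two cycles (assumed interpretation
    of "the terminal two divisions")."""
    # Each cycle duration is the interval between consecutive divisions.
    durations = [
        later - earlier
        for earlier, later in zip(division_times_min, division_times_min[1:])
    ]
    # Slice off the first cycle and the final two; short records give [].
    return durations[1:-2]


# Hypothetical division times (minutes after loading) for one mother cell.
example_times = [0, 90, 170, 255, 350, 520, 800]
print(analyzable_cycle_durations(example_times))
```

For the hypothetical record above, six cycles are measured and three remain after the exclusions.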

Southern Blotting Analysis of Telomere Length 

Genomic DNA was prepared from cells from serial streaks on solid media after 
the indicated number of passages. Genomic DNA was then digested with XhoI 
and run on 0.8% agarose gels. DNA was transferred from the gels to Hybond N+ 
membranes and probed with end-labeled WT telomeric repeat oligonucleotide 
(TGTGGTGTGTGGGTGTGGTGT) as described previously (Rose et al., 
1990; Sambrook and Russel, 2001) and visualized using a phosphoimager. 

Mitochondrial and Sir3 Quantification 

Cells containing mito-tagged (mt)GFP were placed on a microfluidics chip as 
described above. Images were taken every 2 hr, and mtGFP intensity was 
measured relative to WT to determine volume of mitochondria. For Sir3-GFP 
foci, 11 images were taken for the Z stack and projected to a single image using 
the maximum value of the column, and the fluorescence intensity of the foci 
was measured using customized software Cellseg 5.4. 
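
The projection step described above (taking the maximum value of each pixel column through the Z stack) can be illustrated with a minimal Python sketch. It is not the Cellseg implementation; the function name and the tiny three-plane stack are hypothetical and stand in for the 11-plane stacks used in the experiments.

```python
def max_intensity_projection(z_stack):
    """Collapse a Z stack (list of equally sized 2D images) into a single
    2D image by keeping, for every pixel, the maximum value across planes."""
    rows = len(z_stack[0])
    cols = len(z_stack[0][0])
    return [
        [max(plane[r][c] for plane in z_stack) for c in range(cols)]
        for r in range(rows)
    ]


# Hypothetical 2 x 2 pixel stack with three focal planes.
stack = [
    [[1, 5], [0, 2]],
    [[3, 4], [7, 1]],
    [[2, 6], [1, 1]],
]
print(max_intensity_projection(stack))
```

Focus intensity would then be measured on the projected image rather than on any single focal plane.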

Statistical Analysis 

Lifespans were compared using the Wilcoxon rank-sum test. Significance for 
the variation of cell-cycle length was determined by F tests. We used Fisher’s 
exact test to determine the significance of frequency of RNR3 peaks in 
different genotypes. 
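
As an illustration of the last of these tests, the sketch below computes a two-sided Fisher's exact test for a 2 x 2 table in plain Python. The table layout (rows as genotypes, columns as cell cycles with or without an RNR3 peak) and the counts are assumptions made for the example, not data from this study; in practice a statistics package would normally be used.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Probability of x counts in the top-left cell given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts: cycles with / without a peak in two genotypes.
print(fisher_exact_two_sided(3, 0, 0, 3))
```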

Quantification of RNR3 Peaks 

Fluorescence was measured every 30 min during lifespan tracking. Frequency 
of RNR3 peaks (at least 1.3-fold over background) for “still-dividing” mother 
cells was calculated as the number of peaks divided by all cell cycles occurring 
within that genotype and, for “terminal” mother cells, as the percentage of 
mother cell lifespans that contained a peak during or after the last two 
divisions. 
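
The two frequency definitions above can be sketched as follows. This is a simplified illustration under stated assumptions: each cell cycle is represented by its list of fluorescence readings, a cycle is counted as containing at most one peak, and the background value, function names, and example traces are hypothetical.

```python
PEAK_FOLD_THRESHOLD = 1.3  # minimum fold over background, from the text


def has_peak(trace, background):
    """True if any reading reaches the fold threshold over background."""
    return any(value >= PEAK_FOLD_THRESHOLD * background for value in trace)


def dividing_peak_frequency(cycle_traces, background):
    """Peaks per cell cycle, pooled over all cycles of still-dividing
    mother cells within one genotype (one peak counted per cycle)."""
    if not cycle_traces:
        return 0.0
    peaks = sum(1 for trace in cycle_traces if has_peak(trace, background))
    return peaks / len(cycle_traces)


def terminal_peak_percentage(terminal_traces, background):
    """Percentage of lifespans with a peak during or after the last two
    divisions; each trace covers that terminal window for one cell."""
    if not terminal_traces:
        return 0.0
    hits = sum(1 for trace in terminal_traces if has_peak(trace, background))
    return 100.0 * hits / len(terminal_traces)


# Hypothetical readings (arbitrary units) with a background of 100.
cycles = [[100, 105, 110], [100, 140, 120], [95, 100, 129]]
print(dividing_peak_frequency(cycles, 100))
print(terminal_peak_percentage([[100, 150], [100, 110]], 100))
```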

SUPPLEMENTAL INFORMATION 

Supplemental Information includes Extended Experimental Procedures, five 
tables, and seven figures and can be found with this article online at 
http://dx.doi.org/10.1016/j.cell.2015.02.002. 
essential to its function 

• Lack of ClpV and different sheath structure support an 
alternative functional state 
SUMMARY 

Type VI secretion systems (T6SSs) are newly identi- 
fied contractile nanomachines that translocate 
effector proteins across bacterial membranes. 
The Francisella pathogenicity island, required for 
bacterial phagosome escape, intracellular replica- 
tion, and virulence, was presumed to encode a 
T6SS-like apparatus. Here, we experimentally 
confirm the identity of this T6SS and, by cryo 
electron microscopy (cryoEM), show the structure 
of its post-contraction sheath at 3.7 Å resolution. 
We demonstrate the assembly of this T6SS by 
IglA/IglB and secretion of its putative effector proteins 
in response to environmental stimuli. The 
sheath has a quaternary structure with handedness 
opposite that of contracted sheath of T4 phage 
tail and is organized in an interlaced two-dimensional 
array by means of β sheet augmentation. 
By structure-based mutagenesis, we show that 
this interlacing is essential to secretion, phagoso- 
mal escape, and intracellular replication. Our 
atomic model of the T6SS will facilitate design of 
drugs targeting this highly prevalent secretion 
apparatus. 

INTRODUCTION 

The type VI secretion system (T6SS) is a recently discovered 
(Bladergroen et al., 2003; Pukatzki et al., 2006; Silverman 
et al., 2012) and characterized (Basler et al., 2012, 2013; Ho 
et al., 2013; Kube et al., 2014) member of secretion systems 
of Gram-negative bacteria (Tseng et al., 2009). T6SSs are crit- 
ical to the virulence of many important human pathogens, 
including Vibrio cholerae, Salmonella enterica, Escherichia 
coli, Burkholderia pseudomallei, and Pseudomonas aeruginosa. 
The T6SS delivers its protein effector into its prey cell by a contractile 
ejection apparatus similar to T4-like bacteriophage tails (Ak- 
syuk et al., 2009), R-type pyocins (Nakayama et al., 2000), 
and many other contractile nanomachines (Leiman and 
Shneider, 2012), although T6SSs are several times longer 
than the other apparatuses (Ho et al., 2014). The T6SS is 
composed of a sheath and a tube anchored to the bacterial envelope 
by a baseplate (Ho et al., 2014) (Figure 1A). Instead of a 
single sheath protein as in phage tails and pyocins, the T6SS 
sheath is composed of two proteins, VipA/VipB in Vibrio, which 
are orthologs of IglA/IglB of F. tularensis (Broms et al., 2009; de 
Bruin et al., 2007). The contraction of the sheath drives the tube 
across its target membrane (Ho et al., 2014; Hood et al., 2010) 
(Figure 1B). However, the atomic structure of the T6SS and the molecular 
interactions required for secretion are not known. 
Although the structure of the Vibrio VipA/VipB outer sheath has 
recently been shown at 6 Å resolution (Kube et al., 2014), 
that structure is insufficient to guide drug design and mutagenesis 
studies. 

Francisella tularensis subsp. tularensis is a Gram-negative 
bacterium that causes a zoonotic infection, tularemia, in 
animals and humans (Ellis et al., 2002). By the airborne route, 
a few organisms can cause lethal pneumonia in humans; 
hence, F. tularensis is a potential agent of bioterrorism and 
classified as a Tier 1 Select Agent. F. tularensis and the 
highly related F. tularensis subsp. novicida are facultative intracellular 
pathogens that replicate within macrophages. After 
uptake by macrophages via looping phagocytosis (Clemens 
et al., 2005, 2012), the bacteria initially reside within a fibrillar- 
coated membrane-bound phagosome that resists fusion with 
lysosomes and exhibits limited acquisition of lysosomal 
markers; however, the bacteria subsequently disrupt their 
phagosomal membrane and replicate extensively in the host 
cell cytosol (Chong and Celli, 2010; Clemens and Horwitz, 
2007; Clemens et al., 2004). F. novicida has considerable homology 
with F. tularensis, but it has only a single copy of the 
Francisella Pathogenicity Island (FPI), and it is of low virulence 
for humans; it thus serves as a more practicable subspecies for 
study. 

Here, we show that the FPI of F. novicida encodes a T6SS, 
and, by cryo electron microscopy (cryoEM), that the two 
proteins of its sheath, IglA and IglB, are interdigitated into a single 
fold similar to that of the phage sheath. CryoEM reconstruction 
at 3.7 Å reveals that β sheet augmentation interlaces the 
two-dimensional array of the sheath, and structure-based 
mutagenesis demonstrates that this interlacing is essential to 
secretion. 
RESULTS 

Environmental Stimuli Trigger Assembly of F. novicida 
T6SS and Secretion of Effector Proteins 

To facilitate structural and functional studies of the FPI-encoded 
T6SS-like apparatus, we engineered F. novicida to express an IglA-GFP 
fusion protein in lieu of IglA from its chromosome (Fn-IglA-GFP). 
These bacteria show only a weak diffuse fluorescence 
when grown in standard liquid culture medium. However, within 
15 min of uptake by macrophages, 10% of the bacteria exhibit 
intensely fluorescent foci (Figures 1C and 1E), and this percentage 
increases to ~70% by 1 day after uptake, by which time the 
bacteria have proliferated extensively in the macrophage cytosol 
(Figures 1D-1F). We used a split-GFP system (Cabantous and 
Waldo, 2006), with domains 1-10 of GFP fused to the C terminus 
of IglA and domain 11 of GFP fused to the N terminus of IglB 
(IglA-GFP1-10 and IglB-GFP11) to test the interaction of IglA 
and IglB in formation of fluorescent foci. These bacteria are not 
fluorescent when grown in broth culture but exhibit intense 
GFP fluorescent foci after uptake and replication within macrophages 
(Figures 1G and 1H). In contrast, F. novicida expressing 
IglA-GFP1-10/SodB-GFP11 (Figure 1I) and F. novicida expressing 
IglA-GFP1-10 without a GFP11 partner (Figure 1J) do not 
form green fluorescent structures, despite their capacity to replicate 
extensively within macrophages (Figure 1K). Western blotting 
confirmed the expression pattern of the IglA/B-split GFP constructs 
(Figure 1L). 

Because FPI mutants are unable to permeabilize their vacuoles, 
formation of the IglA/IglB-containing T6SS and secretion 
of its effectors is believed to precede and to be required for 
phagosome disruption and cytosolic escape by Francisella. 
The extremely high expression at late stages of infection sug- 
gests that the FPI-encoded T6SS is required for events after 
phagosome permeabilization. Consistent with this, (1) most FPI 
genes are induced during intramacrophage growth and show 
high levels of expression at late time points after bacterial uptake 
(Golovliov et al., 1997; Wehrly et al., 2009), when most bacteria 
are cytosolic, and (2) we have observed that IglC is required 
for intracellular growth of F. novicida that are micro-injected 
directly into the cytosol of HeLa cells. 

In addition to intracellular growth in macrophages, two other 
conditions induce formation of fluorescent structures within 
Fn-IglA-GFP or Fn-IglA/B-split GFP: (1) placement beneath coverslips 
(either sealed by silicone or not) at room temperature for 
more than 3 hr (Figures 2A-2E) and (2) growth in broth containing 
5% KCl (but not 5% NaCl) (Figures 2F-2J). The fluorescent foci 
do not form in Fn-IglA-GFP with a deletion of iglB in response 
to any of the above three conditions (Figure S1), suggesting 
that IglB subunits are required for assembly of the T6SS-like 
structure. The stimulus for assembly of fluorescent foci provided 
by placement beneath coverslips is unclear but could include 
physical stimulation, modulation of oxygen tension, or other 
environmental conditions. The fluorescent foci form with similar 
kinetics in bacteria sandwiched between a polystyrene surface 
and glass coverslip or between two glass surfaces but form more 
slowly when the glass slide is replaced with a 25 μm gas-permeable 
Lumox (Sarstedt) film base, consistent with a role for oxygen 
tension. 



Whereas by fluorescence microscopy Basler et al. (2012) 
demonstrated highly dynamic assembly and disassembly of 
VipA-GFP-labeled structures of live Vibrio, we have not observed 
similar rapid disassembly of the IglA-GFP or IglA/B-split GFP 
fluorescent structures in individual live bacteria. The slower turnover 
of the F. novicida T6SS may reflect the absence of an identifiable 
protein in the F. tularensis genome to disassemble the 
T6SS. Although the F. novicida genome lacks a protein with homology 
to ClpV, we cannot rule out the possibility of a protein 
with no such sequence homology that fulfills this role. However, 
F. novicida IglB, a VipB homolog, lacks the latter’s identified 
ClpV-interacting motif (Pietrosiuk et al., 2011): an α-helical region 
at the N terminus of VipB (including a consensus sequence 
LLDEIM, residues 19-24 of VipB). Indeed, Protein-Protein 
BLAST analysis indicates that the N-terminal 56 amino acids of 
VipB have no sequence homology with any region of IglB, and 
the N terminus of F. novicida IglB has no similar α-helical region 
or consensus sequence. The GFP tags on IglA and IglB are not 
likely to hamper their interaction with any disassembling chaperone 
because (1) we observe the same slow turnover for both 
F. novicida expressing IglA-GFP (GFP fused to the C terminus 
of IglA) and F. novicida expressing the split GFP construct (IglA 
with GFP domains 1-10 fused to its C terminus and IglB 
with GFP domain 11 fused to its N terminus), and (2) for 
V. cholerae, rapid turnover was observed even for VipA-GFP 
(C-terminal fusion), indicating that the corresponding VipA-GFP 
fusion in V. cholerae does not prevent rapid ClpV-mediated 
disassembly. If F. novicida has a functional homolog of ClpV, 
then its interactions with the T6SS sheath are likely to differ 
markedly from those in V. cholerae. 

Secretion of VgrG by F. novicida has been reported previously 
(Barker et al., 2009). We prepared FLAG-tagged VgrG and grew 
the bacteria in the presence or absence of KCl. FLAG-VgrG and 
IglC are both detected in the culture supernatant in the presence, 
but not in the absence, of KCl (Figure 2K). Deletion of iglA abolishes 
the release of VgrG and IglC into the culture medium 
(Figure 2K). 

IglA/IglB Heterodimers Assemble to Form Francisella 
T6SS Sheaths 

We purified assemblies containing IglA-GFP or IglA from lysates 
of bacteria grown in trypticase soy broth supplemented with 
0.2% L-cysteine (TSBC) in the presence of KCl. Both proteins 
sedimented in equilibrium to below 55% sucrose on sucrose 
gradients. Negatively stained TEM images of the fraction con- 
taining IglA-GFP or IglA showed rod-shaped particles of variable 
length, similar to Vibrio cholerae T6SS sheaths with or without 
GFP tag (Basler et al., 2012) (Figures 2L-2M). Rod-shaped 
sheath particles containing IglA-GFP, but not IglA, are recog- 
nized by immunogold labeling for GFP (Figure S2). 

We recorded cryoEM image stacks (movies) (Campbell et al., 
2012) with a Gatan K2 direct electron detector (McMullan et al., 
2009) operated in counting mode (Li et al., 2013) (Figure 3A); obtained 
a three-dimensional structure to 3.7 Å by an integrative 
approach that implements iterative helical real space recon- 
struction (IHRSR) (Egelman, 2010) within the Relion (Scheres, 
2012) framework (Figure 3B); and built an atomic model of the 
T6SS apparatus (Figure 3C) (see also Experimental Procedures). 
Figure 1. F. novicida Expressing GFP-Tagged IglA or IglA/B-Split GFP Form Fluorescent Foci within Macrophages, Consistent with T6SS 
Assembly 

(A and B) Proposed T6SS model in pre-contraction (A) and post-contraction (B) conformation. IM, inner membrane; OM, outer membrane. 

(C-F) F. novicida expressing IglA-GFP assemble fluorescent foci after uptake by macrophages, with 10% doing so at 15 min of infection (C) and 70% at 22 hr of 
infection (D). F. novicida bacteria are stained with a red fluorescent antibody; host and bacterial DNA is stained blue with DAPI; and arrows indicate bacteria 
shown at higher magnification in the insets. Scale bars, 10 μm (insets, 1 μm). The percentage of Fn-IglA-GFP with fluorescent foci (E) and the number of F. novicida 
per THP-1 macrophage (F) were determined at each time point. Data shown represent the mean ± SE of measurements of at least 144 cells per time point. The 
experiment was done three times with similar results. 

(G-J) Requirement for IglA/B interaction for formation of the fluorescent foci was demonstrated in F. novicida expressing IglA-GFP1-10/IglB-GFP11 split GFP (Fn-IglA/B-split GFP) 
and controls. At 15 min post-infection of THP-1 macrophages (G), ~10% of Fn-IglA/B-split GFP form fluorescent structures at their poles, and by 

(legend continued on next page) 
Figure 2. F. novicida Assemble Fluorescent Foci in Response to High KCl and Placement beneath a Coverslip and Secrete 
T6SS Effector Proteins in Response to Environmental Stimuli 

(A-E) Placement beneath a glass coverslip: Fn-IglA-GFP (A and B) and Fn-IglA/B-split GFP (C and D) in TSBC were placed beneath a silicone-sealed 
coverglass and imaged immediately (Fn-IglA-GFP, A; Fn-IglA/B-split GFP, C) or after 4 hr at room temperature (Fn-IglA-GFP, B; Fn-IglA/B-split GFP, D). 
Images shown are merged green fluorescence and phase contrast images, with phase contrast shown in blue. Bacteria exhibit either a diffuse weak 
fluorescence or no fluorescence immediately after placement beneath a glass coverslip (A and C) but exhibit intense fluorescence at their poles 
after 4 hr at room temperature (Fn-IglA-GFP, B; Fn-IglA/B-split GFP, D). Scale bar, 2 μm. (E) Time course of formation of the fluorescent structures 
by Fn-IglA/B-split GFP after placement beneath a coverslip at room temperature. Data are represented as means ± SEM. 

(F-J) High KCl: Fn-IglA-GFP or Fn-IglA/B-split GFP were inoculated into TSBC at an OD 540 nm of 0.05 and grown at 37°C overnight to late 
exponential phase (OD 1.0-1.4) in the absence (Fn-IglA-GFP, F; Fn-IglA/B-split GFP, H) or presence of 5% KCl (Fn-IglA-GFP, G; Fn-IglA/B-split 
GFP, I). Bacteria exhibit only a weak diffuse fluorescence in the absence of KCl (F and H) but develop intense fluorescence at their poles in 
response to 5% KCl (G and I). Scale bar, 2 μm. See also Figure S1. (J) Growth and kinetics of formation of fluorescent foci in Fn-IglA/B-split GFP in 
TSBC with and without 5% KCl. Fn-IglA/B-split GFP was inoculated into TSBC broth with or without 5% KCl at an optical density of 0.05 and grown 
at 37°C rotating at 250 rpm. OD (red lines) and percentage of bacteria with fluorescent foci (blue lines) in the 
presence (solid squares) or absence (open squares) of KCl were monitored over time. 

(K) Secretion in response to high KCl: VgrG and IglC are secreted by F. novicida with intact T6SS growing in TSBC with (left), but not without, 5% KCl (right). WT, 
wild-type; A/B, Fn-IglA/B-split GFP; FV, Fn-FLAG-VgrG; ΔA FV, Fn-FLAG-VgrG ΔiglA. 

(L-M) Sheath-like macromolecular structures purified from wild-type and IglA-GFP expressing F. novicida. F. novicida expressing native (L) or GFP-tagged (M) 
IglA assemble similar sheath-like macromolecular structures. Negatively stained TEM images of density gradient fractions from F. novicida expressing wild-type 
IglA (L) or IglA-GFP (M) show rod-shaped structures of variable length. 

See also Figure S2. 
Like contractile phage tails (Leiman and Shneider, 2012), the 
sheath is organized in a cylindrical, helical form with an axial 
6-fold rotational symmetry (Figure 3 and Movie S1). The sheath 
is constructed with discs of six heterodimers of IglA/IglB proteins 
(Figure 3D and Movie S2). The validity of the map and the model 
is demonstrated by the deep grooves in α helices, well-separated 
β strands, and match of side chains in the sequence and 
in the density map (Figures 3E and 3F). 

The asymmetric unit of the helical sheath contains one copy 
each of IglA and IglB. The folds of the IglA/IglB proteins can be 
thought of as a “split” of the single protein of a bacteriophage 
sheath with many insertions (Figures 4A-4D and Movie S3). 



22 hr after infection (H), most of the bacteria exhibit green fluorescent structures. In contrast, F. novicida expressing IglA-GFP1-10/SodB-GFP11 (I) and F. novicida 
expressing IglA-GFP1-10 without a GFP11 partner (J) do not form green fluorescent structures, despite their capacity to replicate extensively within the 
macrophages. Panels on the left show F. novicida stained with a red fluorescent antibody (in black and white), the middle panels show GFP green fluorescence (in 
black and white), and the panels on the right show merged color images with DAPI-stained host and bacterial DNA in blue. Arrows in (G) indicate bacteria with 
intense GFP fluorescent structures at 15 min post-infection, one of which is shown at higher magnification in the inset. Scale bars, 10 μm (inset, 1 μm). 

(K) Kinetics of growth of IglA/B-split GFP and control strains in THP-1 macrophages. THP-1 macrophages were infected with Fn-IglA/B-split GFP or control 
strains, incubated at 37°C for 0.5, 2, or 22 hr, fixed with 4% paraformaldehyde, and permeabilized with 0.1% saponin, and the bacteria were stained with a red 
fluorescent antibody. The number of bacteria per macrophage nucleus was determined by automated counting using CellProfiler Software. All strains show 
capacity to grow in macrophages. Data are represented as means ± SEM. 

(L) Western immunoblot confirms expression pattern for IglA/B-split GFP constructs. Lane 1, Fn. Lane 2, Fn-IglA-GFP. Lane 3, Fn-IglA-GFP1-10. Lane 4, 
Fn-IglA/B-split GFP. Lane 5, Fn-IglA/SodB-split GFP. M, molecular mass standards, as indicated to the left of the figure. Primary antibody used is indicated on the 
right for each blot. 
Figure 3. CryoEM and Atomic Model of F. novicida T6SS 

(A) A representative cryoEM image of F. novicida T6SS recorded on a direct- 
electron detector. The yellow box marks one sheath. 

(B) CryoEM density map of the F. novicida T6SS sheath displayed as shaded 
surface (left) and cut-away view (right), both colored according to radius. See 
also Movie S1. 

(C) Atomic model of F. novicida T6SS with six discs shown. The 12-start 
helices (ridges) are colored alternatively with rainbow colors and gray. See also 
Movie S2. 

(D) Hexagonal disc formed by six IglA/IglB heterodimers of F. novicida T6SS 
shown in two different orientations. Each heterodimer is displayed as a ribbon 
diagram of a different color. 

(E-F) Two representative density regions (chicken-wire) with their atomic 
models (sticks, colored by atoms; C, green; N, blue; O, red; S, yellow). 

Joining together, these two proteins form an α-β-α sandwich. The 
central β sheet of this sandwich is formed by interdigitation of 
strands from both IglA and IglB (Figure 4E). The sheath fold splits 
between its first and second α helices, and each of the two ends 
of the split is appended with an insertion (box in Figure 4D). Parts 
of these two insertions are disordered (C-terminal amino acids 
136-184 of IglA and N-terminal amino acids 2-78 of IglB [question 
marks in Figure 4D]). The domain near the C terminus of 
IglB (cyan in Figures 4C and 4D) highly resembles the C-terminal 
domain of the phage sheath proteins (Aksyuk et al., 2009, 2011). 




Figure 4. Atomic Model and Architecture of T6SS Sheath and 
Comparison with Bacteriophage T4 

(A and B) Atomic model of an IglA/IglB dimer shown as ribbons with domains 
colored as indicated in the bars below (B). 

(C) Superposition of the atomic models of the IglA/IglB dimer (colors) and 
bacteriophage T4 sheath (gray) (Leiman et al., 2004). See also Movie S3 and 
Figure S3. D1-D4, domains 1-4. 

(D) Secondary structure diagram of an IglA/IglB dimer, colored as in (A) and (B). 
Elements in the box are insertions at the “split.” 

(E) The central β sheet of an IglA/IglB dimer is made of interdigitated strands 
(inside oval) from both subunits. 

(F) Ribbon diagram of IglA/IglB centered on the third and fourth helices of IglB 
(marked by two dashed lines) (left) and comparison between the schematics of 
this vicinity in F. novicida T6SS and in T4 phage (right). 

(G and H) Comparison between the quaternary structures of post-contraction 
F. novicida T6SS (G) and T4 phage (H) sheaths showing opposite handedness. 
The angles between their 12-start helices and their helical axes are marked on 
their sides, respectively. The semi-transparent shaded surface of the density 
map of the T6SS (G) sheath at 10 Å resolution is fitted with six discs of ribbon 
models colored as in Figure 3C. The density map of T4 phage sheath at 17 Å is 
taken from EMDB entry EMD-1086. 

See also Figure S4. 



F. novicida T6SS and Bacteriophage T4 Outer Sheaths 
Show Divergent Quaternary Organization Despite 
Secondary and Tertiary Structural Homology 

The secondary structure and tertiary structural arrangement of 
the F. novicida T6SS sheath are similar, but not identical, to 



those of the bacteriophage T4 sheath (Fokine et al., 2013). 
The central β sheet of the F. novicida T6SS is formed by interdigitating 
two proteins (Figure 4E). Notably, phage T4 sheath 
protein has an insertion of two domains before a long helix (Fig- 
ures 4C and 4F, see lower right schematic, and Movie S3), 
Figure 5. Structure-Based Mutagenesis of F. novicida T6SS 

(A) Ribbon model of three (marked by numbers) interacting IglA/IglB dimers with their augmented β sheet circled. 

(B) Magnified view of the circled β sheet in (A). 

(C) Ribbon diagram of this augmented β sheet. The chains marked “1,” “2,” and “3” in (B) and (C) 
come from the IglA/IglB dimers marked with the same number in (A). 

(D-M) F. novicida expressing IglAΔ18N/B-split GFP (D-G) or IglA/BΔ25C-split GFP (I-L) form 
fluorescent structures in response to KCl (D versus E, I versus J) or incubation under coverslips 
(F versus G, K versus L). Green fluorescence and phase contrast images are merged with phase 
contrast shown in blue. Scale bar, 2 μm. Ribbon diagrams depict deletions in the N terminus of IglA 
(H) and C terminus of IglB (M). 

(N) Western blots showing that FLAG-tagged VgrG (21 kDa) and IglC (22 kDa) are secreted into culture 
filtrate in response to KCl by the parental Fn-IglA/B-split GFP strain but not by the strains expressing 
truncated IglA or truncated IglB. A/B, Fn-IglA/B-split GFP; AΔ18N/B, Fn-IglAΔ18N/B-split GFP; 
A/BΔ25C, Fn-IglA/BΔ25C-split GFP. All Fn-IglA/B split GFP parental and mutant strains 
express FLAG-VgrG. Samples were run on a single gel. Gray lines indicate gaps where irrelevant 
lanes were removed. 

(O) Growth curves demonstrating that the loss of T6SS function in the deletion mutants renders 
them unable to multiply intracellularly in human macrophages. Whereas the parental F. novicida 
IglA/B-split GFP grows 1.7 logs in THP-1 human macrophages in 1 day, the split-GFP strains expressing 
truncated IglA (AΔ18N/B) or truncated IglB (A/BΔ25C) show markedly impaired growth at 
a level equivalent to that of IglA-GFP-expressing strains with a deletion in iglB (A/ΔB, Fn-IglA-GFP 
ΔiglB) or iglC (A/ΔC, Fn-IglA-GFP ΔiglC). CFU, colony forming unit. Data are represented as 
means ± SEM. 



whereas in the T6SS sheath, this long helix is broken into two 
shorter helices, its third and fourth helices from its N terminus. 
In the case of F. novicida, the third helix tilts up and is 
almost perpendicular to the fourth helix (Figure 4F, dashed 
lines in ribbon diagram, and upper right schematic). In stark 
contrast, the quaternary structural organization of the T6SS 
sheath is markedly different from that of the post-contraction 
T4 sheath (Leiman et al., 2004) (Figures 4G and 4H; see also 
Figure S3). The helix with the shortest, non-zero pitch (a six- 
start helix, black dashed line in Figures 4G or 4H) in the T6SS 
sheath has a left-handed turn of 33.4° (Figure S3D) and a rise 
of 20.8 Å per subunit (whereas the T4 sheath has a right- 
handed turn of 32.9°); thus the two helical architectures mirror 
each other. Despite opposite handedness, layer lines in the 
Fourier spectra of the EM images (Figure S4) indicate that the 
other parameters of the helical symmetry of the F. novicida 
T6SS sheath and the post-contraction T4 sheath are similar, 
suggesting that this T6SS sheath structure is in its post- 
contraction state. 
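
The mirror relationship between the two helical lattices can be illustrated with a short sketch. It places one reference point per subunit using the turn (33.4°) and rise (20.8 Å) quoted above together with 6-fold rotational symmetry. The radius of 100 Å, the sign convention (negative turn = left-handed), and the function name are assumptions made only for illustration; the right-handed lattice reuses the T6SS values with the sign flipped rather than the actual T4 parameters.

```python
import math

def helical_lattice(turn_deg, rise, radius, n_layers, c_sym=6):
    """Return one (x, y, z) reference point per subunit.

    Layer k is rotated by k * turn_deg about the helical (z) axis and
    shifted by k * rise along it; each layer holds c_sym copies related
    by rotation about the same axis.
    """
    points = []
    for k in range(n_layers):
        for j in range(c_sym):
            phi = math.radians(k * turn_deg + j * 360.0 / c_sym)
            points.append((radius * math.cos(phi),
                           radius * math.sin(phi),
                           k * rise))
    return points

# Assumed convention: negative turn = left-handed, positive = right-handed.
# The 100 A radius is a hypothetical value, not taken from the structure.
t6ss = helical_lattice(-33.4, 20.8, 100.0, n_layers=6)
right_handed = helical_lattice(+33.4, 20.8, 100.0, n_layers=6)

# Reflecting the left-handed lattice through the xz-plane (y -> -y)
# reproduces the right-handed lattice point for point.
mirrored = [(x, -y, z) for (x, y, z) in t6ss]
```

Under these assumptions, every point in `mirrored` coincides with a point in `right_handed`, which is what "the two helical architectures mirror each other" means geometrically.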



F. novicida T6SS Outer Sheath Has a Highly Interlaced 
Two-Dimensional Array Architecture with Augmented β 
Sheets that Is Essential to Secretory Function 

The two-stranded β sheet of the C-terminal domain of IglB is 
augmented on one side by arms that emanate from the two 
neighboring IglA/IglB dimers from a disc above (Figures 5A- 
5C, S3B, and S3C), forming a four-stranded sheet. By this 
augmentation, the arms from hundreds of copies of IglA/IglB 
interlace a two-dimensional array (Figures S3A and S3C). We 
hypothesized that this two-dimensional interlacing is crucial to the 
integrity and thus the contractility and secretory capacity of the 
F. novicida T6SS. To test this hypothesis, we carried out 
structure-based mutagenesis experiments for residues centered on 
the augmented β sheet. We prepared IglA/B-split GFP mutant 
strains lacking either N-terminal amino acids 2-18 of IglA 
(Fn-IglAΔ18N/B-split GFP) or the last 25 amino acids at the 
C terminus of IglB (Fn-IglA/BΔ25C-split GFP). We predicted that both 
mutant strains would show normal assembly of their pre- 
contraction T6SS organelles but that the T6SS sheath would 
Figure 6. IglA N-Terminal or IglB C-Terminal Deletion Mutants Are Defective in Phagosome Escape 

(A) Unlike the parental strain (Fn-IglA/B-split GFP), Fn-IglAΔ18N/B-split GFP and Fn-IglA/BΔ25C-split GFP (stained with a green fluorescent anti-F. novicida 
antibody, shown in black and white in the left panels) are unable to escape their phagosome and remain within CD63-positive compartments (stained with a red 
fluorescent antibody, shown in black and white in the middle panels) in human macrophage-like THP-1 cells, both at 6 hr and 26 hr. Arrows indicate bacteria 
stained with the anti-F. novicida antibody. The panels on the right show merged color images with F. novicida stained green, CD63 stained red, and host and 
bacterial DNA stained blue with DAPI. 

(B) Cytosolic versus phagosomal localization of F. novicida was determined using a differential digitonin permeabilization antibody staining procedure. F. novicida 
bacteria accessible to antibody after digitonin permeabilization (which permeabilizes the plasma membrane but not the phagosomal membrane) were stained 
with a red fluorescent anti-F. novicida antibody, and all bacteria were subsequently stained with a green fluorescent antibody after saponin permeabilization. 
Whereas many of the parental strain F. novicida are stained by the red fluorescent antibody after digitonin permeabilization at 6 hr post-infection (arrowheads), 
none of the Fn-IglAΔ18N/B-split GFP and Fn-IglA/BΔ25C-split GFP (arrows) are stained by the red fluorescent antibody either at 6 hr or 26 hr after infection. 
Parental bacteria that remain inaccessible to antibody after digitonin permeabilization at 6 hr are indicated by arrows. At 26 hr post-infection, the parental 
F. novicida have proliferated extensively within the cytosol, and the majority are accessible to the red fluorescent antibody after digitonin permeabilization. The 
Fn-IglAΔ18N/B-split GFP and Fn-IglA/BΔ25C-split GFP bacteria (indicated by arrows) show much more limited replication within the macrophage and remain 
inaccessible to the red fluorescent antibody. Scale bars, 10 μm. These experiments were performed three times with similar results. 



become unstable with attempted contraction and thus fail to 
secrete effector proteins. 

Indeed, both the IglA and IglB truncation mutants expressing 
the truncated IglA/B with the split GFP tags formed fluorescent 
foci in response to 5% KCl or incubation beneath coverslips 
(Figures 5D-5M), indicating focal multimerization of the IglA/B 
heterodimers. However, the mutants were completely defective in 
secretion of IglC (Figure 5N) and had markedly impaired growth 
in THP-1 macrophage-like cells, with growth kinetics similar to 
iglB and iglC deletion mutants (Figure 5O). Like the iglB and 
iglC deletion mutants, the Fn-IglAΔ18N/B- and Fn-IglA/BΔ25C- 
split GFP mutants remain within compartments that stain 
positive for the lysosomal marker CD63 (Figure 6A) and are unable 
to escape their phagosomes (Figure 6B). 

DISCUSSION 

By employing the split-GFP technology, we identified three 
stimuli that trigger the assembly of IglA/IglB sheaths in F. novicida: (1) 
late exponential phase with high bacterial density in the 
presence of high KCl, (2) placement beneath a coverslip, and (3) 
the intramacrophage environment. Although there is precedent 
for demonstration of T6SS regulation by quorum sensing 
(Ishikawa et al., 2009; Lesic et al., 2009), salinity (Ishikawa et al., 
2012), and membrane perturbation (Basler et al., 2013; Ho 
et al., 2013), we are unaware of prior reports of T6SS assembly 
induced selectively by KCl or by intracellular residence. We 
hypothesize that KCl is a stimulus for T6SS assembly in the 
intracellular environment and that higher concentrations are required 
in liquid culture to compensate for the absence of other 
intracellular stimuli. The factors stimulating T6SS when bacteria are 
placed beneath a coverslip are unclear and could involve 
physical stimuli, altered oxygen tension, or other environmental 
influences. We confirmed that one stimulus, high KCl in liquid culture, 
induced secretion of putative T6SS effector proteins, VgrG and 
IglC, as would be expected for induction of the secretion system. 
We purified sufficient amounts of IglA/IglB sheaths from bacteria 
stimulated with this condition for high-resolution cryoEM 
structural studies. Taking advantage of the latest technological 
breakthrough, direct electron counting, we determined the structure 
of this sheath to 3.7 Å, permitting the de novo atomic modeling of 
IglA/IglB and the confirmation of its identity as the T6SS in 
F. novicida. This structure shows a highly interlaced 
two-dimensional array that is critical to the secretion function. 

As with other contractile nanomachines, the F. novicida T6SS 
propels its central tube by contraction of its sheath assembly. An 
intriguing feature of the sheath structure is its two-dimensional 
interlacing formed by augmented β sheets. It is likely present in 
the pre-contraction form of the T6SS sheath as well and plays 
a critical role during contraction. Our work directly reveals the 
importance of this interlaced structure to the contractile function 
of this T6SS and lays the foundation for understanding the pre- 
sumably shared mechanism of other contractile nanomachines. 

Results utilizing the truncation mutants of IglA/IglB prove the 
critical role of the interlacing formed by augmented β sheets 
during contraction; in addition, the intensive polymerization of 
IglA/IglB, as reported by the fluorescent foci, suggests that the T6SS 
sheath array assembles without interlacing. Therefore, an 
additional factor must contribute to T6SS assembly. By analogy 
with bacteriophages, we propose that the sheath assembles 
on a preassembled IglC tube, whose monomer structure is 
available (PDB: 2QWU). Close scrutiny of the IglB and IglC surfaces 
suggests a potential protein-protein interface for IglB and IglC, 
given favorable rotamers of residues (Figure S5). However, 
reliable assignment of this interaction requires the structure of 
pre-contraction T6SS. 

It was thought, based on its limited sequence homology, that 
the F. tularensis FPI-encoded T6SS is an outlier among others 
(Filloux et al., 2008). Most notably, the F. tularensis T6SS 
lacks a homolog for ClpV ATPase, which is linked to T6SS 
disassembly (Filloux et al., 2008). Nevertheless, Francisella has 
homologs to other key T6SS proteins, including IglA/IglB, VgrG, DotU, 
and PdpB; Francisella IglC is structurally homologous to 
Pseudomonas Hcp (de Bruin et al., 2011). Remarkably, these 
differences and similarities coincide with differences and similarities 
between our structure of the F. novicida T6SS sheath and the 
other available T6SS structure (Kube et al., 2014); although the 
dimeric organization and the fold of the sheath proteins and 
the two-dimensional interlacing among dimers are the same 
for both, the quaternary structural organizations are different. 
Whereas the turn per subunit of the contracted Vibrio T6SS 
sheath is 30.56°, giving a 3° tilt off the vertical to the 12-start 
helices (Kube et al., 2014), our T6SS sheath has a turn per subunit 
of 33.4°, giving an ~20° tilt to the 12-start helices (Figure 3C). 

It is always tempting but challenging to give an explanation for 
such a difference. Although a facile explanation is that the 
F. novicida T6SS is an outlier, its lack of a ClpV ATPase suggests 
an alternative explanation. In fact, the Vibrio and Pseudomonas 
T6SS sheaths are severed into shorter pieces, whereas our 
purified F. novicida T6SS sheath is typically much longer, 
sometimes more than five times longer (Lossi et al., 2013). A few more 
helices are resolved in the 6 Å structure of the Vibrio cholerae 
T6SS sheath (Kube et al., 2014) than in our sheath structure. 
These helices belong to the part of sheath proteins that interacts 
with ClpV (Kube et al., 2014). Because the function of ClpV is to 
facilitate disassembly of the post-contraction sheath, the Vibrio 
T6SS structure might be in a state that is closer to disassembly 
than our structure. Our structure is closer to the immediate post- 
contraction state because it shares a similar turn per subunit 
number to post-contraction bacteriophage T4 sheath (Leiman 
et al., 2004). If so, then the T6SS sheath would have three states: 
pre-contraction, post-contraction, and pro-disassembly. This 
model awaits confirmation by structural analyses of a ClpV 
knockout mutant of Vibrio cholerae. 

We show here that VgrG and IglC are secreted by F. novicida in 
an IglA-dependent fashion in response to environmental stimuli, 
confirming secretory activity of the Francisella FPI-encoded 
apparatus. Secretion of effectors through this T6SS within 
macrophages is required for phagosome escape and cytosolic 
replication because deletion mutants lacking IglA, IglB, IglC, or VgrG 
(Barker et al., 2009; de Bruin et al., 2011) and the contraction- 
incompetent mutants IglAΔ18N/B- and IglA/BΔ25C-split GFP 
presented here were all unable to escape the phagosome and 
markedly impaired in intramacrophage replication. 

On the other hand, our Fn-IglAΔ18N/B- and Fn-IglA/BΔ25C- 
split GFP mutants stand out from other IglA and IglB mutants 
studied to date. Our split-GFP data suggest that the mutated 
proteins still interact to form IglA/IglB heterodimers that can 
assemble green-fluorescent macromolecular structures 
(fluorescent foci), yet show loss of secretory function. This is a 
more precise manipulation of the protein functions owing to 
the availability of an atomic structure. In contrast, the previously 
reported mutations are not based on a structure; mutants among 
IglA residues 102-129 (Bröms et al., 2009) caused loss of 
interaction with IglB (i.e., failure to form the IglA/IglB heterodimer). 
Particularly, substitution of alanine for valine at residue 109 of 
IglA or at the corresponding residue 110 of Vibrio VipA was 
shown to abolish IglA-IglB and VipA-VipB interaction, 
respectively (Bröms et al., 2009). Nevertheless, our atomic model 
provides an explanation for such loss of interaction: Val109 of IglA 
participates centrally in a hydrophobic core formed between 
IglA and IglB. The V109A mutation destroys this hydrophobic 
interaction and weakens IglA-IglB interaction. In addition to the 
above structure-based functional studies, our atomic model 
provides many other targets for further mutagenesis studies. 

In conclusion, we established the identity and determined the 
atomic structure of an FPI-encoded T6SS in Francisella, the 
genus that includes the highly pathogenic bacterium and Tier 1 
Select Agent, F. tularensis subsp. tularensis. Our atomic 
structure of the F. novicida T6SS and structure-based mutational 
analysis reveal the critical importance of the interlaced 
architecture to secretion. This atomic model will facilitate the design and 
testing of therapeutics targeting this and similar bacterial 
secretion apparatuses, which are pivotal to the virulence of many 
pathogenic Gram-negative bacteria. 

EXPERIMENTAL PROCEDURES 
Bacteria 

F. tularensis subsp. novicida strain U112 (F. novicida) and the derivative strains 
were cultivated in TSBC. For targeted gene deletion and epitope tagging, 
upstream and downstream chromosomal regions flanking the gene of interest 
were amplified with in-frame deletion of the gene or with the gene fused to 
the coding sequence of an epitope tag. The amplified fragments were inserted 
into pMP590 (LoVullo et al., 2006) by traditional cloning using 
restriction endonucleases or by Gibson Assembly (New England BioLabs) and 
confirmed by sequencing. The resulting plasmid constructs were chemically 
transformed into F. novicida for allelic exchange. Additional information on 
F. novicida strains used in the study is provided in Table S1. 

Kinetics of Formation of Fluorescent Foci 

Bacteria were inoculated in TSBC with and without 5% KCl at an initial optical 
density at 540 nm (OD) of 0.05 and grown at 37°C rotating at 250 rpm. Optical 
density and percentage of bacteria with fluorescent foci were monitored over 
time. To observe formation of fluorescent foci beneath a glass coverslip, we 
placed bacteria between a glass slide and a silicone-sealed (and also 
unsealed) coverglass and imaged immediately or after 3-6 hr at room 
temperature. 

Secretion of Effector Proteins 

Bacteria were grown to late exponential phase in TSBC with or without 5% 
KCl, pelleted by centrifugation, and the supernate was filtered through a 
0.45 micron filter. The culture filtrate (equivalent to 1 ml culture at an OD of 
1.5) and bacterial pellet (equivalent to 5 x 10^ bacteria) were analyzed by 
SDS-PAGE and western immunoblot. IglA and IglA-GFP fusion variants were 
detected by rabbit polyclonal antibody to His-tagged recombinant IglA (BEI 
Resources) or affinity-purified rabbit polyclonal antibody to native GFP 
(Millipore); IglB was detected by murine monoclonal antibody specific to 
His-tagged recombinant IglB (BEI Resources); IglC was detected by rabbit 
polyclonal antibody to highly purified recombinant IglC; and FLAG-VgrG was 
detected by murine monoclonal antibody M2 to FLAG epitope (Sigma). 
Antibodies to IglA and IglB were obtained through the NIH Biodefense and 
Emerging Infections Research Resources Repository, NIAID, NIH. 

Macrophage Infection and Immunofluorescent Staining 

Human monocytic THP-1 cells were differentiated on poly-L-lysine-coated 
coverslips with phorbol 12-myristate 13-acetate (100 nM) for 3 days. 
F. novicida strains were grown overnight in TSBC to OD 540 nm of 1-1.5; 
opsonized with 10% human AB serum for 10 min at 37°C at an OD of 0.002; 
diluted in Dulbecco’s Modified Eagle’s Medium (DMEM) with 10% heat-inac- 
tivated fetal bovine serum (HI-FBS) to an OD 540 nm of 0.0002; and added to 
the monolayers of THP-1 cells in a 24-well plate. To synchronize infection, we 
pelleted the bacteria onto the monolayers by centrifugation of the plates at 
800 g for 30 min at 4°C. The plates were warmed to 37°C for 30 min; the mono- 
layers were washed twice with DMEM to remove non-internalized bacteria; 
and the second wash was replaced with fresh DMEM containing 10% HI-FBS. 
In experiments intended to follow growth of bacteria by determination 
of colony forming units (CFU), 0.1 μg/ml gentamicin was added to the culture 
medium to restrict extracellular growth of the bacteria. For immunofluorescence 
experiments, the monolayers were fixed for 30 min in 4% paraformaldehyde 
in PBS; washed with PBS; permeabilized with 0.1% saponin in PBS with 
10 mM lysine; and stained for F. novicida using a chicken anti-F. novicida 
antibody (kind gift of Professor Denise Monack, Stanford University), followed by a 



Texas-red-conjugated goat anti-chicken IgY antibody (Life Sciences). Host 
cell and bacterial DNA were stained with DAPI (1 μg/ml), and the coverslips 
were mounted with Prolong Gold anti-fade mounting medium (Life Sciences) 
and viewed with an Eclipse TE2000 (Nikon) inverted fluorescence microscope 
equipped with FITC, Texas red, and DAPI filter cubes and SPOT camera and 
software or with a SPS2 (Leica) confocal microscope. The late endosomal/ 
lysosomal marker, CD63 (LIMP), was stained using the H5C6 hybridoma 
culture supernate obtained from the University of Iowa Developmental Studies 
Hybridoma Bank. The differential digitonin assay was performed by a modifi- 
cation of the assay described previously (Checroun et al., 2006) and as we 
have described previously (Gillespie et al., 2013). The chicken antibody to 
F. novicida was used immediately after digitonin permeabilization to detect 
cytosolic bacteria and a mouse IgGs monoclonal antibody to F. novicida ob- 
tained from ImmunoPrecise (Victoria, BC, Canada) was used after saponin 
permeabilization to detect all bacteria. 

Purification of T6SS 

Wild-type F. novicida and Fn-IglA-GFP were grown to late exponential phase in 
TSBC containing 5% KCl; pelleted by centrifugation (7500 g for 15 min at 4°C); 
and lysed with lysozyme and 1% TX-100 detergent in 20 mM Tris HCl (pH 7.8) 
with 1 mM EDTA, protease inhibitor cocktail III (1:1,000, EMD Millipore), and 
Benzonase nuclease (1:1,000, EMD Millipore). The lysate was centrifuged 3 
times at 15,000 g for 30 min at 4°C to pellet bacterial debris, and the supernate 
was carefully layered onto a 10%-55% sucrose gradient overlying a 55% 
Optiprep cushion and centrifuged at 100,000 g for 18 hr. Fractions were 
collected and analyzed by TEM negative staining and by western 
immunoblotting. Fractions were diluted 1:50-1:500 and examined by TEM negative 
staining using 2% uranyl acetate. We observed that the 150 nm-200 nm sheath-like 
structures sedimented to just below the 55% sucrose/Optiprep interface, 
thereby allowing these structures to be purified away from the majority of 
the bacterial debris. The fractions with the greatest purity as judged by TEM 
negative staining were dialyzed against 20 mM Tris HCl (pH 7.5), 0.9% NaCl 
(TBS) and concentrated with a 100,000 MWCO centrifugal concentrator 
(Amicon). 

Immunogold Electron Microscopy 

Fractions of T6SS purified as described above from cultures of wild-type 
F. novicida or F. novicida expressing IglA-GFP were diluted 100-fold with 
TBS and applied to glow discharged formvar-coated nickel grids. After 
5 min, the grids were washed with TBS, blocked for 30 min with 20 mM HEPES, 
0.15 M NaCl containing 1% BSA (HBS-BSA), and stained with rabbit anti-GFP 
(1:1,000) for 60 min in the HBS-BSA. The grids were washed with HBS; stained 
for 60 min with 5 nm protein A gold in HBS-BSA (University of Utrecht); washed 
with HBS; negatively stained with 2% uranyl acetate; and examined by TEM 
using a JEOL 100CX II. 

Cryo Electron Microscopy, Image Processing, and Resolution 
Assessment 

2.5 μl of purified T6SS outer sheaths of wild-type F. novicida, purified as 
described above, was applied to a pre-irradiated, or “baked” (Miyazawa 
et al., 2003), 200 mesh Quantifoil grid (1.2 μm hole size) and vitrified inside an 
FEI Vitrobot with 100% humidity. CryoEM images were collected at liquid 
nitrogen temperature in an FEI Titan Krios cryo electron microscope operated at 
300 kV with parallel illumination using a Gatan K2 Summit direct electron 
detector in counting mode (Li et al., 2013) with a dosage of 25 e⁻/Å² and a nominal 
magnification of 29,000x and using the Leginon package (Suloway et al., 
2009). Final pixel size is 1.00 Å/pixel. The defocus range of these images is 
1.5-3.5 μm underfocus. Image stacks were preprocessed according to the 
method described previously (Li et al., 2013). 

T6SS particles were selected manually with EMAN (Ludtke et al., 1999) 
helixboxer. The three-dimensional structure was reconstructed with modified (Ge 
et al., 2010; Ge and Zhou, 2011) IHRSR (Egelman, 2010) with Relion (Scheres, 
2012) as the refinement engine (see below). Our initial guess of the helical 
parameters was derived from those of the post-contraction T4 sheath (Leiman 
et al., 2004), which we then refined to convergence with IHRSR (Egelman, 
2010). An initial refinement was accomplished with EMAN-based modified 
IHRSR, and the resulting structures were taken as the initial reference for the 
final refinement using Relion-based IHRSR. The final refinement was done 
using the “auto-refine” function of Relion, which automatically calculates the 
working resolution for each iteration and determines the final convergence 
(see below). The refinement for T6SS converged at 26 iterations. The box 
size was 640 pixels, and the overlapping was set to 64 pixels. A total of 
480,000 asymmetric units were included in the refinement process and in 
the final reconstruction. The effective resolution of the final reconstruction 
was estimated to be 3.7 Å based on considerations of “gold-standard” FSC 
(Figure S6A) and consistency with structural features observed in the density 
map (Movie S4). 

Implementation of IHRSR in Relion Package 

We implemented IHRSR (Egelman, 2010) in the Relion 1.2 package (Scheres, 
2012). The procedure is very similar to the “modified IHRSR” used in our 
previous studies (Ge et al., 2010; Ge and Zhou, 2011): the volume is refined for one 
iteration with Relion without helical symmetry; the program then refines the 
helical symmetry with the re-implemented IHRSR module hsearch (though non-linear 
curve fitting is not implemented); the new helical symmetry is applied in real 
space to the volume by the re-implemented IHRSR module himpose; and the 
helically symmetric volume is masked (as in Relion) and passed to the next 
iteration as its reference. The helical turn and rise step sizes used in hsearch are 
first set to the orientation and translation search step sizes in Relion divided by 
the number of helical copies in the volume, respectively (hsearch step-size = 
Relion search step-size / box size / pixel size × helical rise per subunit). Then 
another helical search (hsearch) is performed about the helical parameters 
found by the first hsearch, with 10x finer step sizes. We implemented both the 
hsearch and himpose modules so that the volume under operation is 
oversampled when necessary to minimize interpolation error (Ge and Zhou, 2011). 
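
As a rough illustration of the step-size rule described above, the sketch below computes the coarse and fine hsearch step sizes from the Relion search steps. The function name, the dictionary keys, and the example Relion step values (1.8° and 1.0 Å) are hypothetical; only the box size (640 pixels), pixel size (1.00 Å), rise (20.8 Å), and the 10x refinement factor come from the text. This is not the authors' code.

```python
def hsearch_step_sizes(relion_angular_step_deg, relion_translation_step,
                       box_size_px, pixel_size, rise, refine_factor=10.0):
    """Derive helical search step sizes from Relion's search steps.

    The number of helical copies along the box is estimated as
    (box size * pixel size) / rise; each Relion step is divided by it.
    A second, finer search uses steps refine_factor times smaller.
    """
    n_copies = box_size_px * pixel_size / rise
    coarse_turn = relion_angular_step_deg / n_copies
    coarse_rise = relion_translation_step / n_copies
    return {
        "n_copies": n_copies,
        "coarse_turn_deg": coarse_turn,
        "coarse_rise": coarse_rise,
        "fine_turn_deg": coarse_turn / refine_factor,
        "fine_rise": coarse_rise / refine_factor,
    }

# Hypothetical Relion steps of 1.8 degrees and 1.0 A; geometry from the text.
steps = hsearch_step_sizes(1.8, 1.0, box_size_px=640, pixel_size=1.0, rise=20.8)
```

With these inputs the box spans roughly 31 helical copies, so the coarse helical search is about 31 times finer than the per-particle Relion search, which reflects the averaging over many copies in one volume.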

We implemented existing and new switches to control the IHRSR behavior: 
two switches to turn on the helical search and symmetrization, two switches to 
set cylindrical radii, two switches to set initial helical parameters, one switch to 
set the oversampling ratio, and one switch to set the fraction of volume in the Z 
direction (helical axis) at the center to be considered in helical search (hsearch) 
and symmetrization (himpose). To save time, the oversampling ratio of the 
hsearch module is set to half that of the himpose module, unless himpose is 
not oversampled. For 3D classification, each class has its own bookkeeping 
of helical parameters. For auto-refinement, both subsets share the same 
helical symmetry. 

For auto-refinement, the two subsets are refined independently, except that 
they share the same helical parameters. After each Relion refinement, each 
subset is subjected to the hsearch module if so switched, the two pairs of refined 
helical parameters are averaged, and the averaged parameters are used for 
himpose. The FSC that is used to determine the working resolution of the 
refinement and to regularize the refinement is calculated between the two 
subsets before helical symmetrization and is “calibrated” in the attempt to 
account for helical symmetrization as follows. The FSC factors are converted 
to signal-to-noise ratio factors by S/N = FSC / (1 - FSC). The latter is used 
to calculate the “calibrated” S/N ratio by multiplying by the square root of the 
number of copies to be helically averaged, and the “calibrated” S/N factors 
are used to calculate the final “calibrated” FSC factors, which are passed 
on to the “maximization” step in Relion and its following iteration for 
regularization purposes. (See Figure S6A for a comparison between gold-standard 
and “calibrated” FSC curves between final maps from half datasets.) 
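
The FSC “calibration” described above can be sketched in a few lines. The forward conversion S/N = FSC / (1 - FSC) and the square-root scaling are taken from the text; the back-conversion FSC = S/N / (1 + S/N) is assumed here as the algebraic inverse of the forward formula, and the function name is hypothetical.

```python
import math

def calibrated_fsc(fsc, n_copies):
    """Rescale one FSC value to account for helical averaging.

    fsc      -- FSC between half-maps before symmetrization (0 <= fsc < 1)
    n_copies -- number of copies to be helically averaged
    """
    if not 0.0 <= fsc < 1.0:
        raise ValueError("fsc must lie in [0, 1)")
    snr = fsc / (1.0 - fsc)                  # S/N = FSC / (1 - FSC)
    snr_cal = snr * math.sqrt(n_copies)      # scale by sqrt(number of copies)
    return snr_cal / (1.0 + snr_cal)         # assumed inverse of the first step

# Example: an FSC of 0.5 with 4 averaged copies.
example = calibrated_fsc(0.5, 4)
```

With one copy the value is unchanged, and a larger number of copies pushes any non-zero FSC toward 1, consistent with symmetrization improving the signal-to-noise ratio.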

Atomic Model Building and Refinement 

A model for the T6SS sheath was built de novo using Coot (Emsley et al., 2010) and 
was refined with CNS (Brünger, 2007) and then with Phenix (Adams et al., 
2010). Non-crystallographic symmetry, i.e., the helical and 6-fold rotational 
symmetry, was used as a restraint among identical chains. A total of six discs 
were included in the working atomic model. This model was ultimately refined 
to 3.7 Å, with R-factor 24.4% and last shell R-factor 46.0%. The FSC curve 
between the model and the map crosses 0.5 at 3.75 Å (Figure S6A). To rule out 
potential over-fitting during model refinement, we have redone the model 
refinement to 4.2 Å, which is 0.5 Å lower in resolution than the map resolution of 
3.7 Å. The correlation (FSC) between the resulting model and the full-resolution map 
does not drop sharply at 4.2 Å, suggesting that over-fitting is unlikely, but the 
FSC value reaches 0.5 at 3.88 Å. The slight loss of resolution of the model 
refined against the 4.2 Å map is expected because, for example, side-chain 
orientation might not be reliably determined at the reduced resolution. 
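
Reading off the resolution at which a model-map FSC curve crosses 0.5 can be sketched as below. The linear interpolation between sampled shells, the function name, and the toy curve are assumptions for illustration only; the data are not the curve in Figure S6A.

```python
def resolution_at_threshold(spatial_freqs, fsc_values, threshold=0.5):
    """Return the resolution (in Angstrom) at the first downward crossing.

    spatial_freqs -- increasing spatial frequencies in 1/Angstrom
    fsc_values    -- FSC value for each frequency shell
    Linear interpolation is used between the two shells that bracket
    the threshold. Returns None if the curve never drops below it.
    """
    for i in range(1, len(fsc_values)):
        f0, f1 = spatial_freqs[i - 1], spatial_freqs[i]
        c0, c1 = fsc_values[i - 1], fsc_values[i]
        if c0 >= threshold > c1:
            f_cross = f0 + (c0 - threshold) * (f1 - f0) / (c0 - c1)
            return 1.0 / f_cross
    return None

# Toy curve (made-up values): crosses 0.5 midway between 0.2 and 0.3 1/A.
toy_resolution = resolution_at_threshold([0.1, 0.2, 0.3], [1.0, 0.6, 0.4])
```

For the toy curve the crossing lies at a spatial frequency of 0.25 1/Å, i.e., a resolution of about 4 Å.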

Figures for the cryoEM maps and atomic models were prepared with UCSF 
Chimera package (Pettersen et al., 2004). 

ACCESSION NUMBERS 

The cryoEM density map and atomic model have been deposited to EMDB 
and PDB under the accession numbers EMD-6266 and 3J90, respectively. 
SUMMARY 

Bacteria use rapid contraction of a long sheath of the 
type VI secretion system (T6SS) to deliver effectors 
into a target cell. Here, we present an atomic-resolution 
structure of a native contracted Vibrio cholerae 
sheath determined by cryo-electron microscopy. 
The sheath subunits, composed of tightly interacting 
proteins VipA and VipB, assemble into a six-start 
helix. The helix is stabilized by a core domain 
assembled from four β strands donated by one VipA and 
two VipB molecules. The fold of inner and middle 
layers is conserved between T6SS and phage 
sheaths. However, the structure of the outer layer 
is distinct and suggests a mechanism of interaction 
of the bacterial sheath with an accessory ATPase, 
ClpV, that facilitates multiple rounds of effector 
delivery. Our results provide a mechanistic insight 
into assembly of contractile nanomachines that 
bacteria and phages use to translocate 
macromolecules across membranes. 

INTRODUCTION 

Several critical components of the type VI secretion system 
(T6SS) are structurally and functionally related to components 
of contractile tails of bacteriophages. Secreted VgrG and 
PAAR proteins form a complex similar to phage spike, secreted 
Hcp is a structural homolog of a phage tube protein, and TssE 
(type six secretion E) is a homolog of T4 phage baseplate protein 
gp25 (Leiman et al., 2009; Pukatzki et al., 2007; Shneider et al., 
2013). VipA and VipB (TssB and TssC) proteins were shown to 
form a cog-wheel-like tubular structure in V. cholerae 
(Bönemann et al., 2009) that was noticed to resemble T4 phage 
gp18 polysheath (Leiman et al., 2009). The VipA/VipB sheath 
assembles around an inner Hcp tube and is attached to a structure 
called a baseplate that spans the bacterial membranes (Basler 
et al., 2012; Brunet et al., 2014; Zoued et al., 2013). Importantly, 
the VipA/VipB sheath was shown to form a long contractile 
organelle in V. cholerae (Basler et al., 2012; Kapitein et al., 
2013) and in E. coli (Brunet et al., 2013), suggesting that sheath 
contraction powers the secretion. In vivo, the contracted sheath 
is specifically recognized by the ClpV ATPase, which 
disassembles the sheath by unfolding VipB from the N terminus (Basler 
and Mekalanos, 2012; Kapitein et al., 2013; Pietrosiuk et al., 
2011). Even though sheath contraction has been implicated in 
powering protein translocation across a membrane for phages, 
pyocins, and T6SS (Leiman and Shneider, 2012), a mechanistic 
understanding of this process is currently limited, mostly due to 
the lack of a high-resolution structure of a native and fully 
assembled sheath. 

There is no high-resolution information available for the T6SS 
sheath, but several crystal structures are available for fragments 
of phage sheath proteins (Aksyuk et al., 2009a, 2011). However, 
a major limitation of these atomic structures is that monomeric 
proteins were used for crystallization and thus, in principle, 
cannot reveal atomic details of inter-subunit interactions in a 
native fully assembled sheath polymer. The structure of the T4 
phage sheath polymer was so far solved only at low resolution 
using electron microscopy (Kostyuchenko et al., 2005; Leiman 
et al., 2004), again not providing the necessary details to fully 
understand sheath assembly. 

Native T6SS sheath has only been isolated from V. cholerae in 
a contracted form (Basler et al., 2012). Even though the sheath 
was isolated without the inner Hcp tube, Hcp and other 
components of T6SS were shown to be necessary for sheath assembly 
(Basler et al., 2012; Brunet et al., 2014; Kapitein et al., 2013). 
Indeed, in contrast to a long and regular T6SS sheath that can 
be isolated from T6SS-positive V. cholerae (Basler et al., 2012), 
VipA/VipB from P. aeruginosa and V. cholerae heterologously 
expressed in E. coli only form short tubes (Bönemann et al., 2009; 
Kube et al., 2014; Lossi et al., 2013); electron microscopy of 
these tubes provided low-resolution density maps (Kube et al., 
2014; Lossi et al., 2013). Nonetheless, a recent ~6 Å resolution 
structure of V. cholerae sheath provided insights into a possible 
mechanism of ClpV-specific disassembly of the contracted 
sheath (Kube et al., 2014). 

Due to recent advances in direct electron detection cameras
and software tools (Egelman, 2010; Faruqi et al., 2003; Li et al.,
2013; Lu et al., 2014), it is now possible to obtain density maps
with a resolution that allows de novo building of atomic
models (Kühlbrandt, 2014). These technical advances allowed
atomic models of the subunit of the mitochondrial ribosome
(Amunts et al., 2014) or the ribosome-Sec61 complex (Voorhees
et al., 2014) to be generated directly and provided fundamental
insights into the mechanisms of those macromolecular machines.
Here, we used state-of-the-art electron microscopic approaches
and the Rosetta density-guided structural modeling
methods to reveal the structure of the contracted VipA/VipB
sheath from V. cholerae in atomic detail.

RESULTS AND DISCUSSION 

Atomic Structure of the VipA/VipB Protomer 

We purified the native contracted sheath from Vibrio cholerae
and imaged it by cryo-electron microscopy (Figure 1A). Fourier
transforms of recorded images showed Thon rings up to ~3 Å
with layer lines in single micrographs up to a resolution of 5 Å
(Figure S1A). Helical reconstruction was performed by the iterative
helical real space reconstruction (IHRSR) method (Egelman,
2000) with the final helical parameters being a 21.8 Å axial rise,
29.4° rotation, and a C6 rotational symmetry about the helical
axis (Figures 1B, S1B, and S1C and Movie S1). The helical parameters
and the overall shape of the sheath are similar to the previously
reported structure (Kube et al., 2014); however, our
approach allowed us to obtain a resolution of ~3.5-4.0 Å, which
improved up to ~3.2 Å for the inner and middle layers of the
sheath (Figure S1D). Most of the amino acid side chains and
some oxygen atoms in the backbone were resolved in the
most ordered parts of the structure (Figure 1C and Movie S1).
Even though the resolution of our protein density decreased for
the outer surface layer, we were able to
de novo trace residues 2 to 126 (out of
168) of VipA and residues 61 to 492 of
VipB (Figures 2A, 2B, and S2A-S2E).
The VipA C terminus and the VipB N terminus
were clearly localized to a less ordered
layer on the surface of the sheath,
as shown in class averages of sheath
images (Figures 1D, S1E, and S1F). To
further improve the geometry of the side
chains, the model was subjected to Rosetta
density-guided all-atom refinement using a physically realistic
energy function (Song et al., 2013; DiMaio et al., 2015). An atomic
model built into an independently generated electron microscopy
(EM) map of lower resolution had a Cα root-mean-square
deviation (RMSD) to the original atomic model of 0.34 Å
(see Experimental Procedures), suggesting that model building
is highly reliable. Analysis of the conservation and coevolution
of VipA/VipB protein residues shows that the conserved residues
are generally facing the inner part of the protomer, variable
residues are exposed on the protomer surface, and distances
between most coevolving residues are within 10 Å (Figure S3
and Table S1).
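
The helical parameters reported above (21.8 Å axial rise, 29.4° rotation, C6 symmetry) fully determine where each protomer sits once a single protomer position is known. The following is a minimal sketch of that bookkeeping, not the authors' reconstruction code; the function name and the `radius_a` value are hypothetical and chosen only for illustration, since the sheath radius is not stated in this section.

```python
import math

RISE_A = 21.8      # axial rise between successive rings, in angstroms (from the text)
TWIST_DEG = 29.4   # rotation between successive rings, in degrees (from the text)
C_SYM = 6          # C6 rotational symmetry about the helical axis (from the text)


def protomer_positions(n_rings, radius_a=100.0):
    """Return (ring, strand, x, y, z) for every protomer in n_rings rings.

    radius_a is a hypothetical radius used only to place points on a cylinder.
    """
    positions = []
    for ring in range(n_rings):
        for strand in range(C_SYM):
            # Each ring is rotated by TWIST_DEG relative to the previous one;
            # the six protomers within a ring are 60 degrees apart.
            angle = math.radians(ring * TWIST_DEG + strand * 360.0 / C_SYM)
            positions.append((
                ring,
                strand,
                radius_a * math.cos(angle),
                radius_a * math.sin(angle),
                ring * RISE_A,
            ))
    return positions


if __name__ == "__main__":
    for ring, strand, x, y, z in protomer_positions(2):
        print(f"ring {ring} strand {strand}: x={x:7.2f} y={y:7.2f} z={z:5.1f}")
```

Following one strand index from ring to ring traces one of the six helical strands, which is why the assembly is described as a six-start helix.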

The atomic model allowed us to calculate energies of protein-protein
interactions using PISA (Proteins Interfaces Structures
and Assemblies) (Krissinel and Henrick, 2007) and understand
how the sheath structure is stabilized in its contracted form.
The strongest intermolecular interaction was calculated between
VipA and VipB to form a heterodimeric sheath protomer with 1:1
stoichiometry (Table 1 and Figures 2A, 2B, and S2F). Two β
strands of VipA and four β strands of VipB intertwine, forming
the middle layer of the sheath (domain 2, Figure 2D). VipA further
binds to one side of VipB, forming 35 hydrogen bonds and 14 salt
bridges. The total interfacial area for this interaction is 3,493 Å²,
and ΔG = -54.8 kcal/mol/protomer represents more than half of
all the interaction energy within the assembled sheath (Table 1).
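
The percentage column of Table 1 appears to follow directly from the ΔG values. The sketch below assumes that each percentage is simply the interface ΔG divided by the sum over the five listed interfaces; that is an assumption about how the column was derived rather than something the text states. The dictionary keys copy the row labels of Table 1.

```python
# Delta-G values (kcal/mol per protomer) as listed in Table 1.
delta_g = {
    "VipA-B main interface": -54.8,
    "VipB-B interaction in the strand": -19.3,
    "VipA-B between the strands": -14.7,
    "VipA-B in the horizontal ring": -4.5,
    "VipB-B vertical interaction": -3.0,
}

# Total interaction energy over the five listed interfaces.
total = sum(delta_g.values())

# Share of each interface, rounded to whole percent as in the table.
percent = {name: round(100 * g / total) for name, g in delta_g.items()}

# Combined energy of the two interprotomer interfaces that hold the strands together.
strand_energy = (delta_g["VipB-B interaction in the strand"]
                 + delta_g["VipA-B between the strands"])

if __name__ == "__main__":
    for name, share in percent.items():
        print(f"{name}: {share}%")
    print(f"in-strand plus between-strand energy: {strand_energy:.1f} kcal/mol")
```

Under this assumption the main VipA-VipB interface accounts for more than half of the total, consistent with the statement in the text.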



Interprotomer Interactions 

The interaction surface between VipB proteins on the same
helical strand covers ~2,444 Å², represents about 20% of the
total interaction energy, and stabilizes the protomers within the
strand. The interface area between VipA and VipB from adjacent
helical strands is 1,143 Å² and contributes -14.7 kcal/mol/protomer
energy to the stabilization of the individual strands within the
six-start helix. Together, these interactions represent an energy
of -34 kcal/mol/protomer and are the major contributors to sheath
stability (Table 1).

Figure 1. Cryo-EM Structure of the T6SS
Sheath

(A and B) (A) A representative low-dose cryo-EM
micrograph with side (red box) and top (red circle)
views of the sheath. Scale bar, 100 nm; (B) assembly
of the protomers into a six-start helix; s,
individual strands; r, horizontal rings. Scale bar,
10 nm.

(C) An example of the atomic model fitted into the
protein density.

(D) Left: a class average of the sheath showing
a density on the surface; right: protein density
filtered to low resolution showing density of the
VipB N terminus and VipA C terminus.

See also Figure S1 and Movie S1.

Figure 2. Atomic Model of the Sheath
Protomer

(A) An atomic model for VipA (pink) and VipB
(blue) with the outlined domains. Domain 3 contains
untraced residues predicted to form 5 α
helices (location marked with an asterisk [*]).

(B-E) (B) Interactions of secondary structure elements
in the protomer; (C) scheme of the handshake
domain assembly by three protomers of
VipA/VipB; (D) two views of domain 2, six β strands
surrounded by 5 α helices stabilizing the interaction
between VipA and VipB in the protomer; (E) a
"handshake domain" in the domain 1 connecting
four β strands: β12 and β13 of VipB in strand A, ring
1 (light blue) with β14 of VipB in strand A, ring 2
(blue), and β1 of VipA in strand B, ring 2 (red).

See also Figure S2 and Movie S1.

Resolution limitations of the previous study (Kube et al., 2014)
led to an imprecise segmentation of a sheath subunit from
the low-resolution density map (Figure S1G). Interestingly, we
show that the subunits are connected by a unique set of interactions
in the innermost layer of the sheath. This "handshake"
domain is assembled from two anti-parallel β strands (β12 and
β13) of one VipB molecule, one parallel β strand (β14) of a second
VipB on the same six-start helical strand, and one parallel β
strand (β1) of a VipA molecule from a neighboring strand in the
six-start helix (Figures 2B, 2C, and 2E).

T6SS and Phage Sheaths Evolved from a Common 
Ancestor 

To understand the evolution of T6SS sheath, we performed a
structural alignment between VipA/VipB and a model of T4 phage
sheath protein gp18 (Aksyuk et al., 2009a; Fokine et al., 2013) and
a crystal structure of Listeria innocua phage sheath protein
Lin1278 (Aksyuk et al., 2011). In contrast to sequence-based
alignments that only detect homology between VipB and phage
sheath proteins, we show that domains 1 and 2, composed of
both VipA and VipB, are highly conserved
and the outer domains 3 and 4 are divergent
(Figures 3 and S4). Domain 1 of
T6SS sheath and the domain 1 of a model
of gp18 or a crystal structure of Lin1278
align with RMSD of 2.7 Å and 2.2 Å,
respectively. RMSDs between the domain
2 of the T6SS sheath and the crystal structures
of the domain 2 of gp18 or Lin1278
are 3.7 Å and 2.8 Å, respectively.
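
For readers unfamiliar with the metric, an RMSD such as those quoted above is the root of the mean squared distance between paired atoms (typically Cα atoms) after the two structures have been superimposed. The sketch below computes only that final number and assumes the coordinates are already superimposed and paired one to one; the superposition itself, and whichever alignment program the authors used, are outside its scope. The function name is hypothetical.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMSD in angstroms between two equally long lists of (x, y, z) tuples.

    Assumes the structures are already superimposed and that the i-th
    atom of coords_a is paired with the i-th atom of coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))


if __name__ == "__main__":
    # Two toy three-atom traces with made-up coordinates.
    a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
    b = [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0), (7.6, 1.0, 0.0)]
    print(ca_rmsd(a, b))
```

Because every pair contributes its squared distance, a few badly matching residues can dominate the value, so RMSDs of 2-4 Å between distant homologs are usually read as evidence of a shared fold rather than of identical structures.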

Interestingly, the architecture of domain
1 differs between phage and T6SS. In
both phage sheath proteins, the first two
β strands have the same orientation as in
the T6SS sheath, but the third β strand
has an opposite orientation, and the fourth β strand is missing
(Figures 3A and 3B). Because the phage sheath structures
were solved for monomers and not for fully assembled polymers,
it is tempting to speculate that, in a fully assembled phage
sheath, the corresponding handshake domain has the same architecture
as in the native T6SS sheath and connects subunits
and strands in the same manner as in T6SS.

The fundamental difference between phage and T6SS sheath
is that phage sheath is used only once, whereas T6SS sheath is
recycled in vivo by ClpV (Basler and Mekalanos, 2012; Kapitein
et al., 2013). Moreover, phages act in an extracellular space,
whereas the T6SS sheaths function in the bacterial cytoplasm.
Here, we show that the major difference between phage
and T6SS sheaths is in the outer layer, which is not only structurally
different but also positioned differently on the sheath surface.
In the case of the T4 phage sheath, the domains 3 and 4
are inserted between β1 and H3 of VipB in the domain 2 (Figures
3A, 3B, and S4). On the other hand, the T6SS sheath has its
domain 3 inserted between H1 of VipA and H2 of VipB (Figures
3A, 3B, and S4). This leads to a major difference in the angle
between domain 3 and domain 2 compared to phage sheath
architecture. Furthermore, the outermost layer of the phage
sheaths is formed mostly by β strands (Aksyuk et al., 2009a,
2011), whereas the T6SS sheath outer layer is predicted to be
composed of five α helices (Figures S2A and S2B).
Table 1. Energy of Interactions between VipA/VipB in the Sheath Assembly

                                  Interface   Number of  Number of     ΔG          % of Interaction
                                  Area (Å²)   H Bonds    Salt Bridges  (kcal/mol)  Energy
VipA-B main interface             3,493       35         14            -54.8       57
VipB-B interaction in the strand  2,444       33         14            -19.3       20
VipA-B between the strands        1,143       7          2             -14.7       15
VipA-B in the horizontal ring     634         11         4             -4.5        5
VipB-B vertical interaction       454         5          2             -3.0        3

Even though the overall fold of domains 1 and 2 of phage
and T6SS is conserved, the T6SS sheath has several potentially
functional insertions compared to phage sheath (Figure 4A).
The VipA/VipB protomer has two weakly conserved extra helices
in the domain 1: VipB H17 and VipB H21. VipB H17 (aa 374-386)
interacts with a loop of the next VipA in a strand (aa 18-24,
originating from the handshake domain). A weakly conserved loop
and a short VipB H21 interact with a loop (aa 412-415) close to
VipB H19.

As hypothesized previously (Basler and Mekalanos, 2012; Kapitein
et al., 2013; Kube et al., 2014), after the sheath contraction,
the VipB N terminus is likely exposed on the sheath surface to
allow disassembly by ClpV. Although an atomic model of an
extended T6SS sheath is not available yet, it is likely that the
N terminus of VipB is not accessible for binding by ClpV in the
extended state to prevent disassembly of the extended sheath.
We show that domain 3 is exposed on the surface of the contracted
sheath, aligning the domains 3 from the neighboring
strands on top of one another. This is in agreement with the
recently proposed model (Kube et al., 2014); however, here we
show that two helices from the VipA C terminus and three from
the VipB N terminus are exposed on the surface. This indeed makes
the VipB N terminus fully accessible for disassembly by ClpV
(Figure 3D), as suggested previously (Kube et al., 2014), but raises
a possibility that VipA is involved in properly positioning VipB
on the sheath surface. Furthermore, our atomic model suggests that precise
positioning of domain 3 could be stabilized by interactions of
three T6SS-specific insertions into the surface of VipB in domain
2: short helices H8-H13, a loop R246-N276, and an outward-facing
hairpin β7-β8. These insertions appear to form a network of
hydrophobic interactions with the domain 3 at the outer surface
of the sheath (Figure 4B). Hairpin β7-β8 forms an interaction
with the H8-H13 of the VipB in the neighboring strand and with
the loop VipB246-276. Loop VipB246-276 interacts with the
two long helices VipA H4 and VipB H1 of the domain 3 from the
inside, whereas the other hairpin VipA β3-β4 stabilizes them
from the outside. The two long helices are further stabilized by
a helix-helix interaction with the conserved interfaces (Figure 4A).

Attachment of the Sheath to the Baseplate 

Whole-cell cryo-electron tomography provided only a low-resolution
structure of the sheath (Basler et al., 2012), and therefore
it is not possible directly from those data to orient the VipA/
VipB structure relative to a baseplate in the bacterial cell wall.
However, considering the degree of structural similarity between
T6SS and phage, it is likely that VipA and VipB are oriented relative
to the baseplate in the same way as gp18 in T4 phage (Aksyuk
et al., 2009a). In Figure 3A, and all other similar top views, the
baseplate would be located behind the plane of view; on all
side views, like the inset of Figure 3A, the baseplate would be
located at the bottom. This orientation of the VipA/VipB protomer
suggests that two β strands per subunit are free to bind to an unknown
T6SS component in the baseplate (Figure 3A). A natural interacting
partner for those two β strands would be a structure
similar to an "empty" 2-β-stranded handshake domain organized
in a hexameric ring similarly to an actual sheath ring.

A search for structural homologs of the T6SS sheath revealed
that protein NP_952040.1 from Geobacter sulfurreducens (PDB:
2IA7), a homolog of the T4 phage baseplate protein gp25,
aligns with the T6SS sheath domain 1 with an RMSD of 2.7 Å
(Figure 5A). As noted previously (Leiman and Shneider, 2012),
phage sheath domain 1 has a fold that is similar to that of
gp25-like protein (Figure 5B). Importantly, gp25 is positioned
near the sheath in the T4 phage baseplate (Aksyuk et al., 2009b).

In a fully assembled handshake domain of T6SS sheath, the
orientation of the third β strand (counting from the lumen of the
sheath) is parallel to the second β strand but antiparallel in crystal
structures of gp25 and its homolog (Figures 5A and S5). We detected
significantly coevolving, and thus potentially interacting,
residues only between the first two β strands of gp25 (Figure S5).
This suggests that, similarly to the sheath handshake domain,
only two β strands of gp25 are present in a native assembly.
The third β strand of gp25 could flip out of the domain and
interact with yet another component of the baseplate. Therefore,
gp25 could accept two additional β strands from interacting proteins
by a mechanism similar to that of sheath subunit
interaction.

Interestingly, T6SS component TssE was suggested to be a
homolog of gp25 (Leiman et al., 2009; Lossi et al., 2011), co-purifies
with the T6SS sheath in V. cholerae (Basler et al., 2012), and
is important for sheath assembly (Basler et al., 2012; Kapitein
et al., 2013). We therefore speculate that the TssE protein could
be the part of the T6SS baseplate that accepts VipA-β1 and
VipB-β14 strands of the first sheath ring and thus initiates the
sheath assembly and also anchors the sheath to the baseplate
(Figure 5C). Moreover, TssG and TssK were shown to copurify
with sheath in V. cholerae (Basler et al., 2012), and VipB was
shown to interact with TssK in E. coli (Zoued et al., 2013), suggesting
that additional proteins are likely involved in attaching
the sheath to the baseplate as well. A stable attachment of a
contractile sheath to a baseplate is likely crucial for generation
of the force needed to deliver substrates across target cell
membranes. The sheath has to bind to the baseplate as strongly
as individual sheath rings bind together; otherwise, the
sheath would likely detach from the baseplate during a rapid
contraction.
Interactions in the Handshake Domain Are Critical for
T6SS Sheath Assembly and Dynamics

Our structural data indicate that interactions between β strands
in domain 1 are important for initiation of sheath polymerization,
extension, and potentially also for sheath contraction. To test
this, we generated truncated versions of VipA and VipB lacking
β1 and β14, respectively. In a background of a fully functional
VipA-msfGFP chromosomal fusion, we show that deletion of
vipB abolishes sheath assembly and target cell killing (Figures
6A and 6C and Movie S2). As shown in Figures 6A-6C and Movie
S2, sheath assembly and target cell killing can be restored by a
wild-type level of expression of a full-length VipB from a plasmid,
but not by a similar level of expression of a mutant lacking β14
(VipB-ΔC). This indicates that a connection between the sheath
protomers on the same strand is essential for sheath assembly
and T6SS function.

To assess the role of the β1 strand of VipA, we compared the
dynamics of full-length VipA-sfGFP expressed in a vipA deletion
background with the dynamics of a β1 strand deletion mutant
(VipA-ΔN). As shown in Figure 6D and Movie S3, the wild-type sheaths
rapidly assemble and contract in almost all cells within 5 min.
Sheaths with disrupted domain 1 are capable of assembling
into structures resembling extended wild-type sheaths but
exhibit very little dynamics (Figures 6D and 6E and Movie S3).
On average, we observe only one assembly event per ~500 cells
over 5 min. Furthermore, the time of sheath assembly is
increased for the VipA-ΔN sheath to about 2 min (Figure 6E),
whereas most wild-type sheaths fully
assemble in 20 to 40 s under the same
conditions. This clearly indicates that a
fully assembled handshake domain is critical
for efficient sheath assembly initiation
and the fast assembly rate of the T6SS
sheath. Interestingly, even though we
inspected sheath dynamics in ~50,000
cells over 5 min, we identified only 5 examples
of unambiguous sheath contraction
and disassembly (one example is
given in Figure 6E). This suggests that
the ability to contract is preserved to
some degree but raises the possibility that domain 1 is involved
in triggering sheath contraction in vivo. Alternatively, the rate
of sheath assembly may play a role in triggering sheath
contraction. Target cell killing in the vipA deletion background was
restored by expression of VipA-sfGFP, but not by expression
of the VipA-ΔN-sfGFP mutant (Figure 6C), suggesting that the mere
ability to assemble sheaths is not sufficient for T6SS-dependent
killing.

Figure 3. Structural Homology between the
T6SS and Phage Sheaths

(A and B) Structural alignment of VipA/VipB (pink/
blue) with (A) model of full-length T4 phage sheath
gp18 (PDB: 3J2N), the inset shows a side view
from the sheath lumen; (B) L. innocua phage
sheath Lin1278 (PDB: 3LML). Structurally homologous
domains 1 and 2 of phage sheaths are
shown in brown; divergent domains 3 are shown in
red for phage tails and in green for VipA/VipB.

(C) A scheme depicting domain organization of
VipA/VipB, gp18, and Lin1278 (partially adapted
from Leiman and Shneider [2012]).

(D) One ring of protomers showing N terminus of
VipB exposed to the outer surface of the sheath
making it accessible to be disassembled by ClpV.
See also Figure S4.

Concluding Remarks 

The unusual four-strand β sheet handshake domain assembled
from three molecules invites comparisons with other protein
polymers. In most protein filaments that have been intensively
studied, such as F-actin (von der Ecken et al., 2014; Galkin
et al., 2015), microtubules (Alushin et al., 2014), bacterial flagellar
filaments (Yonekura et al., 2003), or type IV pili (Craig et al., 2006),
subunits can be treated as compact, and the assemblies are
held together by the exclusion of solvent at the buried interfaces
(Miller et al., 1987). In contrast to these, type I pili from bacteria
have a polymerization mechanism that involves an N-terminal
extension of one subunit that becomes a β strand within a β
sheet of an adjacent subunit (Waksman and Hultgren, 2009).
This β sheet formed by two subunits gives a remarkable stability
to the filaments and allows type I pili to resist very large shear
forces (Castelain et al., 2011; Miller et al., 2006). We expect
that this architecture allows the sheath to transfer a large force
and remain intact during its rapid contraction.
Figure 4. Divergence of the T6SS Sheath from the Phage Sheath

(A) VipA/VipB protomer (pink/blue) with the additional insertions compared to
phage sheath Lin1278 marked in yellow; residues with sequence conservation
over 70% are marked in red.

(B) Interaction network in the outer domain of VipA/VipB. BH8-13* are part of
VipB in the neighboring subunit.

See also Figure S3.



EXPERIMENTAL PROCEDURES 
Bacterial Strains and DNA Manipulations 

V. cholerae 2740-80 parental and ΔvipA strains and the pBAD24-VipA-sfGFP
plasmid were described previously (Basler et al., 2012). The pBAD24-VipA-ΔN-sfGFP
plasmid was created by replacing the vipA gene in the pBAD24-VipA-sfGFP
plasmid with a gene lacking codons encoding 23 N-terminal amino acids using
standard methods. The V. cholerae 2740-80 vipA-msfGFP strain was created by
replacing vipA on the chromosome with a vipA-msfGFP fusion by sacB-mediated
allelic exchange using the pWM91 suicide plasmid as described previously
(Basler and Mekalanos, 2012; Basler et al., 2012; Bina and Mekalanos,
2001; Metcalf et al., 1996). msfGFP differs from the previously used sfGFP by a
Val206-to-Lys substitution, which was previously described to reduce dimerization
of GFP (Zacharias et al., 2002). Comparison of VipA-msfGFP to VipA-sfGFP
expressed from the pBAD24 plasmid in the ΔvipA strain revealed no difference
in dynamics of the fusion proteins (data not shown). The linker between VipA
and msfGFP was 3xAla 3xGly, as used previously on the pBAD24 plasmid (Basler
et al., 2012). To limit effects of the fusion gene on the downstream genes in the
T6SS locus, we added the last 21 bp of vipA at the end of vipA-msfGFP.
The V. cholerae 2740-80 vipA-msfGFP ΔvipB strain was created by replacing
vipB with a gene encoding the "MMSTTEKGRLDQA" peptide (first seven and
last six residues of vipB fused in frame) by allelic exchange as described above
and as done previously (Basler et al., 2012). Standard techniques were used
to clone a PCR-amplified vipB or the first 477 codons of vipB into the pBAD24
plasmid (Guzman et al., 1995) to generate the pBAD24-VipB and pBAD24-VipB-ΔC
plasmids, respectively. All PCR-generated products were verified by
sequencing. Plasmids were transformed into V. cholerae by electroporation.
A gentamicin-resistant E. coli MG1655 strain was used in bacterial killing
assays. The strain list is provided as Table S2.

Antibiotic concentrations used were streptomycin (100 µg/ml), ampicillin
(200 µg/ml), and gentamicin (15 µg/ml). Luria-Bertani (LB) broth was used for
all growth conditions. Liquid cultures were grown aerobically at 37°C.

Fluorescence Microscopy 

Procedures similar to those described previously (Basler et al., 2012)
were used to detect fluorescence signal in V. cholerae. Overnight cultures
of V. cholerae carrying pBAD24-vipA-sfGFP, pBAD24-vipA-ΔN-sfGFP,
pBAD24-vipB, or pBAD24-vipB-ΔC were washed with LB and diluted 50x
into fresh LB supplemented with ampicillin, streptomycin, and 0.003% arabinose
for VipA and 0.006% arabinose for VipB and cultivated for 2.5-3.0 hr to an
optical density (OD) at 600 nm of about 0.8-1.2. Strains without plasmid
were grown similarly, but no ampicillin or arabinose was added. Cells from
100 µl of the culture were re-suspended in 5-10 µl of fresh LB (to OD ~10),
spotted on a thin pad of 1% agarose in LB, and covered with a glass coverslip.
Cells were immediately imaged at room temperature using an objective heated
to 37°C. A microscope configuration similar to the one described previously
(Basler et al., 2013) was used: Nikon Ti-E inverted motorized microscope
with Perfect Focus System and Plan Apo 100x Oil Ph3 DM (NA 1.4) objective
lens. A SPECTRA X light engine (Lumencore) and ET-GFP (Chroma #49002) filter set
were used to excite and filter fluorescence. An sCMOS camera pco.edge 4.2
(PCO, Germany) (pixel size 65 nm) and VisiView software (Visitron Systems,
Germany) were used to record images. Fiji (Schindelin et al., 2012) was used
for all image analysis and manipulations as described previously (Basler
et al., 2013). Contrast on compared sets of images was adjusted equally. All
imaging experiments were performed with three biological replicates.

Bacterial Killing Assay 

V. cholerae 2740-80 strains and the E. coli MG1655 strain were incubated
overnight at 37°C in LB supplemented with appropriate antibiotics. Cultures
were washed in fresh LB, diluted 100x, and grown to OD 0.8-1.2 in the presence
of appropriate antibiotics and 0.01% arabinose for strains with pBAD24 plasmids.
Cells were washed and mixed at a final OD of ~10 in a 10:1 ratio (V. cholerae
to E. coli) as specified, and 5 µl of the mixture was spotted on a dry LB agar
plate containing 0.01% arabinose but no antibiotics. After 3 hr, bacterial spots
were cut out and the cells were re-suspended in 0.5 ml LB. The cellular suspension
was serially diluted (1:10) in LB, and 5 µl of the suspensions were
spotted on selective plates (gentamicin for E. coli and streptomycin for
V. cholerae). Colonies were counted after ~16 hr incubation at 30°C. Three
biological replicates were analyzed.
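
The colony counts from such a spot-plating assay are normally converted back to viable cells in the original suspension. The sketch below shows that arithmetic for the 1:10 dilution series and 5 µl spots described above; the function name and the example colony count are hypothetical, and the text does not state which dilution was counted or how the authors reported survival.

```python
def cfu_per_ml(colonies, dilution_steps, spot_volume_ul=5, fold=10):
    """Colony-forming units per ml of the undiluted suspension.

    colonies:        colonies counted in one spot (hypothetical input)
    dilution_steps:  number of serial 1:fold dilutions before spotting
    spot_volume_ul:  volume spotted, in microliters (5 in the protocol above)
    """
    if colonies < 0 or dilution_steps < 0 or spot_volume_ul <= 0:
        raise ValueError("counts and volumes must be non-negative / positive")
    dilution_factor = fold ** dilution_steps
    # 1000 converts the spotted volume from microliters to milliliters.
    return colonies * dilution_factor * 1000 / spot_volume_ul


if __name__ == "__main__":
    # Hypothetical example: 12 colonies in the spot from the fourth 1:10 dilution.
    per_ml = cfu_per_ml(12, 4)
    # The cells from one spot were re-suspended in 0.5 ml LB.
    print(per_ml, per_ml * 0.5)
```

Multiplying the per-ml value by the 0.5 ml re-suspension volume gives the number of survivors recovered from one spot, which can then be compared between strains.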

VipA/VipB Sheath Purification

Sheath was purified similarly to a previous method (Basler et al., 2012). An overnight
culture of the flgG in-frame deletion mutant of the parental V. cholerae 2740-80
strain (Basler et al., 2012) was diluted 1:200 into 1,000 ml of fresh LB and then
shaken at 37°C for 2.5-3.0 hr to reach an OD of 1.0-1.5. Cells were cooled on
ice, centrifuged for 10 min at 7,000 × g, and lysed in 50 ml lysis buffer (150 mM
NaCl, 20 mM Tris [pH 8], lysozyme 200 µg/ml, DNase I 50 µg/ml, 5 mM
EDTA, 0.1% SDS, 0.5% Triton X-100). Cell lysis was complete after 5-10 min
incubation at 37°C. To activate DNase to cleave DNA, MgCl2 was added to
10 mM final concentration and, after 2-5 min incubation at 37°C, EDTA was
added to reach 15 mM final concentration. Cell debris was removed by centrifugation
for 20 min at 10,000 × g. Cleared lysates were subjected to ultraspeed
centrifugation at 150,000 × g for 1 hr at 4°C. Pellets were re-suspended in 1 ml
of TND buffer (150 mM NaCl, 20 mM Tris [pH 8], 0.5% Triton X-100, 0.1% SDS),
and insoluble material was removed by centrifugation for 1 min at 10,000 × g.
Supernatant was diluted to 50 ml in TND buffer and subjected to ultraspeed
centrifugation at 150,000 × g for 1 hr at 4°C. The pellet was washed with 2 ml of
PBS and resuspended in 2 ml PBS. Insoluble material was removed by centrifugation
for 1 min at 10,000 × g. Supernatant was diluted to 50 ml with PBS and
subjected to ultraspeed centrifugation at 150,000 × g for 1 hr at 4°C. The pellet was
washed with 2 ml of PBS and resuspended in 1 ml PBS, and insoluble material
was removed by centrifugation for 1 min at 10,000 × g. Supernatant contained
pure sheath. Purity was assessed by Coomassie-stained gel, and protein concentration
was measured by standard approaches.

Figure 5. Evolutionary Conservation of the
Handshake Interactions and Sheath Assembly
Initiation

(A) Structural alignment of the gp25-like phage
protein from G. sulfurreducens (brown, PDB: 2IA7)
and a T6SS sheath handshake domain containing
VipB (light blue), VipB from the next subunit on
the same strand (blue), and VipA from the next
strand (red).

(B) Alignment of the gp25-like phage protein from
G. sulfurreducens (brown) and domain 1 of phage
sheath Lin1278 (red).

(C) A model for sheath assembly initiation and
polymerization as viewed from inside the tube:
recruitment of VipA/VipB protomers (through their
free β strand) to the baseplate protein TssE
(providing 2 β strands); establishment of the full
4-β-stranded handshake domain starting with VipB,
followed by VipA; recruitment of additional VipA/
VipB protomers to the newly formed ring.

See also Figure S5.

Peptide-Specific Antibodies 

Antigen-purified rabbit polyclonal antibodies raised against VipB peptide 
QENPPADVRSRRPL were obtained commercially (GenScript, USA). Speci- 
ficity of the antibodies was tested on V. cholerae strains expressing or lacking 
vipB. 

Cell Fractionation and Immunoblot Analysis

Cells from 250 µl culture prepared for imaging as described above were re-suspended
in 100 µl PBS and subjected to sonication (20 cycles, 100% amplitude,
0.5 s cycle) (UIS21 5V Hielscher Ultrasonics GmbH, Germany). Then 10 µl
of 10% SDS was added, and the sample was sonicated as before. Samples
were incubated for 10 min at 95°C and centrifuged, and 17 µl were mixed with
7 µl 4x NuPAGE LDS Sample Buffer (Life Technologies) and 2 µl 1 M DTT.
Samples were heated again for 10 min at 72°C, centrifuged, and loaded on
4%-12% pre-cast polyacrylamide gels (Life Technologies) and transferred
to nitrocellulose membrane (Amersham Biosciences, UK). The membrane was
blocked with 5% milk in Tris-buffered saline (pH 7.4) containing 0.1% Tween
(TBST), incubated with primary peptide antibody for 16 hr at 4°C, washed
with TBST, incubated for 1.5 hr with horseradish peroxidase-labeled anti-rabbit
antibody (Jackson Lab), and washed with TBST, and peroxidase was
detected by LumiGLO Chemiluminescent Substrate (Cell Signaling Technology,
USA).

Cryo-Electron Microscopy 

Sample was placed on holey carbon grids (Quantifoil GmbH, Germany) and
plunge frozen into liquid ethane cooled down to LN2 temperature using a Vitrobot
MK4 (FEI Corp, the Netherlands). Frozen grids were stored in LN2 and
directly observed in a Titan Krios (FEI Corp, the Netherlands) operated at
300 kV and equipped with a K2 Summit direct electron detector (Gatan, Pleasanton,
CA). All images were acquired in a single 2 day session at a defocus
range of 0.5-1.5 µm. Images were recorded in dose fractionation mode, with
a dose rate of 3-4 e⁻/pix/s, an exposure per image sub-frame of between 1 and
1.5 e⁻/Å², and a cumulative dose for the entire image series of 30 e⁻/Å². The
final pixel size for the resulting 3838 × 3710 pixel images was 1.0 Å/pix.
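
The acquisition parameters above constrain two quantities that are not stated explicitly: the total exposure time and the number of sub-frames per image. The sketch below derives rough estimates of both from the reported numbers; the results are derived illustrations, not values reported by the authors, and the function names are hypothetical.

```python
def exposure_seconds(total_dose_per_a2, dose_rate_per_pix_s, pixel_size_a):
    """Exposure time needed to accumulate total_dose_per_a2 (e-/A^2).

    The dose rate is given per detector pixel per second, so it is first
    converted to e-/A^2/s using the pixel area at the specimen level.
    """
    dose_rate_per_a2_s = dose_rate_per_pix_s / (pixel_size_a ** 2)
    return total_dose_per_a2 / dose_rate_per_a2_s


def subframe_count(total_dose_per_a2, dose_per_subframe_a2):
    """Number of sub-frames implied by a fixed dose per sub-frame."""
    return total_dose_per_a2 / dose_per_subframe_a2


if __name__ == "__main__":
    # Values from the text: 30 e-/A^2 total, 3-4 e-/pix/s, 1.0 A/pix,
    # 1-1.5 e-/A^2 per sub-frame.
    print(exposure_seconds(30, 3, 1.0), exposure_seconds(30, 4, 1.0))
    print(subframe_count(30, 1.5), subframe_count(30, 1.0))
```

With a 1.0 Å pixel, one pixel covers 1 Å², so the per-pixel and per-area dose rates coincide numerically; the conversion matters only for other pixel sizes.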

Image Processing and 3D Reconstruction 

Alignment for beam-induced movement was performed by 2 dx_automator 
(Scherer et al., 2014) that provides on-the-fly drift correction based on the al- 
gorithm implemented by Li et al. (2013). Images recorded as movie data in 
“counting mode” were drift corrected with the algorithm by Li et al. (2013). 



The quality of the images drastically improved after drift correction, especially 
at high resolution (Figure SI A). Drift on the order of 1 0 A could be fixed and re- 
sulted in Thon rings up to 3-3.5 A. All recorded frames up to 30 e~/A^ were 
used, and no weighting was performed. From the recorded ~250 images, 
the best 77 were selected based on ice thickness and the quality of the 
Thon rings. Contrast transfer function (CTF) determination was performed by 
CTFFIND3 (Mindell and Grigorieff, 2003). This led to exclusion of one image, 
due to a poor fit between the theoretical and observed Thon ring pattern. 
The images were then multiplied by the estimated CTF in SPIDER to both 
correct phases and to improve the SNR. Filaments were boxed using the 
e2helixboxer function within EMAN2 (Tang et al., 2007), using a box width of 
600 pixels for the initial alignment and 384 pixels for the final reconstruction. 
The SPIDER software package (Frank et al., 1996) was used for most subse- 
quent steps. From the long boxes, overlapping segments were cut that were 
600 pixels long with a shift of 30 pixels between boxes, where the shift (yielding 
95% overlap) was chosen to be ~1.5x the axial rise per subunit. A total of 
10,203 segments were obtained. The segments were then padded to 600 ×
600 pixels and decimated to 200 × 200 pixels (3 Å/pix) for initial
alignments and reconstruction using IHRSR (Egelman, 2000). Once these were
reconstructed, the original images were subsequently decimated to 300 × 300
pixels for further processing that included out-of-plane tilts. Finally, the initial
boxes were windowed to 384 × 384 pixels for several cycles of IHRSR with
1.0 Å/pix until convergence. At the end of each iteration, helical symmetry
with a rise of 21.8 Å and a rotation of 29.4 degrees with C6 symmetry was
applied. Class averages for Figures 1D and S1F were generated using Spring
(Desfosses et al., 2014). 
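
The segmentation arithmetic described above can be checked with a short sketch. It is an illustration only, not the SPIDER or EMAN2 procedure used here: the function names and the 3,000-pixel filament length are hypothetical, while the box length (600 pixels), shift (30 pixels), pixel size (1.0 Å/pix), and axial rise (21.8 Å) are taken from the text.

```python
# Minimal sketch of cutting overlapping segments from one boxed filament.
# Helper names and the example filament length are assumptions for illustration.

def segment_starts(filament_length_px, box_px, shift_px):
    """Start offsets (pixels) of overlapping boxes cut along one filament."""
    if filament_length_px < box_px:
        return []
    return list(range(0, filament_length_px - box_px + 1, shift_px))


def overlap_fraction(box_px, shift_px):
    """Fraction of a box shared with its immediate neighbor."""
    return 1.0 - shift_px / box_px


BOX_PX = 600          # segment length from the text
SHIFT_PX = 30         # shift between consecutive segments from the text
PIXEL_SIZE_A = 1.0    # Å per pixel from the text
AXIAL_RISE_A = 21.8   # axial rise per subunit from the text

overlap = overlap_fraction(BOX_PX, SHIFT_PX)
shift_in_rises = SHIFT_PX * PIXEL_SIZE_A / AXIAL_RISE_A
starts = segment_starts(3000, BOX_PX, SHIFT_PX)  # hypothetical filament

print(f"overlap between neighbors: {overlap:.0%}")
print(f"shift in units of axial rise: {shift_in_rises:.2f}")
print(f"segments from a 3000 px filament: {len(starts)}")
```

With these values the shift corresponds to 95% overlap and to roughly 1.4 axial rises, in the neighborhood of the ~1.5x quoted above.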

To test reproducibility of the atomic model building, a completely indepen- 
dent EM-density map was generated, starting from the initial micrographs fol- 
lowed by independent particle picking using the e2helixboxer function within 
EMAN2 (Tang et al., 2007). Square boxes of 400 Å length (1 Å/pix) were picked
with a step of 30 Å. Iterative real space helical reconstruction (Egelman, 2000)
was performed with Spring (Desfosses et al., 2014), starting with a featureless
cylinder as an initial model. At the end of each iteration, C6 symmetry was
applied to the reference. All segments were processed as one data set, and
resolution estimated by Fourier Shell Correlation between the half data sets
was 4.3 Å (FSC = 0.5).
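
As an illustration of how a resolution value is read off an FSC curve at the 0.5 criterion, the sketch below interpolates linearly between the two shells that bracket the threshold. The curve values are invented for the example and are not the data from this study; the function name is likewise hypothetical, and Spring reports this value by its own means.

```python
# Minimal sketch: resolution at which an FSC curve first falls below a threshold.
# The example curve is hypothetical, not the measured FSC from the paper.

def resolution_at_threshold(spatial_freqs, fsc_values, threshold=0.5):
    """Return 1/frequency (in Å) where FSC first drops below `threshold`.

    spatial_freqs: shell frequencies in 1/Å, ascending.
    fsc_values: FSC per shell, same length.
    Returns None if the curve never crosses the threshold.
    """
    for i in range(1, len(fsc_values)):
        f0, f1 = spatial_freqs[i - 1], spatial_freqs[i]
        c0, c1 = fsc_values[i - 1], fsc_values[i]
        if c0 >= threshold > c1:
            # Linear interpolation between the two bracketing shells.
            frac = (c0 - threshold) / (c0 - c1)
            return 1.0 / (f0 + frac * (f1 - f0))
    return None


freqs = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30]   # 1/Å, hypothetical
fsc = [0.99, 0.97, 0.90, 0.70, 0.40, 0.10]     # hypothetical curve
res = resolution_at_threshold(freqs, fsc, threshold=0.5)
print(f"resolution at FSC = 0.5: {res:.2f} Å")
```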

Atomic Model Building 

Model building was done de novo, with initial models of a single subunit built 
first, and then the system was refined in a symmetrical complex with all the
interacting subunits present. A model of a single-subunit VipA/VipB was
manually built in Coot (Emsley et al., 2010), guided by an initial partial model from
Buccaneer (Cowtan, 2006), which placed a total of 513 residues into the
density map. In parallel, automated model building was carried out independently



Figure 6. Handshake Domain Integrity Is Important for Sheath Dynamics 

(A) Sheath assembly was detected by fluorescence microscopy. Parental strain, V. cholerae with VipA-msfGFP fusion encoded in the native locus. Deletion of
vipB gene was complemented by expression of either WT vipB or vipB lacking C-terminal β strand (VipB-ΔC) from pBAD24 plasmid. 15 × 10 μm fields of cells are
shown. Bar is 1 μm. See also Movie S2.

(B) Expression of VipB or VipB-ΔC was detected in the indicated strains prepared as for the imaging shown in (A) by western blotting using VipB-specific antibody.

(C) Level of E. coli killing on a plate was measured for indicated strains after 3 hr of incubation at 10:1 ratio. Presence or absence of vipA or vipB on the
chromosome is indicated by + or −, respectively. Complementation was from pBAD24 plasmid carrying indicated genes. ΔN, vipA lacking N-terminal β strand;
ΔC, vipB lacking C-terminal β strand. Data are represented as mean ± SD.

(D) Sheath assembly was detected by fluorescence microscopy. Parental strain, V. cholerae vipA−. Deletion of vipA gene was complemented by expression of
either WT vipA or vipA lacking N-terminal β strand (VipA-ΔN) from pBAD24 plasmid. 20 × 20 μm field of cells shown. Bar is 1 μm. See also Movie S3.

(E) Dynamics of sheath assembly for WT VipA (two examples, top) and VipA lacking N-terminal β strand (VipA-ΔN) (two examples, middle). An example of sheath
contraction and disassembly is shown for VipA-ΔN (bottom).



using a newly developed approach (Wang et al., 2015). The automated method
uses sequence-derived backbone conformations with side-chain density fit to
accurately assign sequence into density maps. Starting with a map segmented
to contain a single subunit, the computational method was able to place
466 residues into the density. 

The two independently derived models showed reasonably good
agreement: 394 residues were assigned in both models with a Cα RMSD of
1.05 Å. However, there were parts of the protein assigned in each model
that were unassigned in the other. Thus, to build and refine the final model, 
we used RosettaCM (Song et al., 2013), a comparative modeling protocol 
that assembles protein structures by recombining portions of several models; 
in this case, the inputs were the two independently traced models. RosettaCM 



was guided by experimental density data, with agreement to the density map 
as an additional score term while building and refining models. A total of 1,000
models were generated, and a best model was selected based on the all-atom 
energy plus the “fit to density” energy. 
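
The selection step can be pictured as ranking models by a single combined score. The sketch below is a schematic under stated assumptions, not the RosettaCM implementation: the `ModelScore` record, the unit weight on the density term, and the example numbers are all hypothetical.

```python
# Minimal sketch: pick the model with the lowest combined score
# (all-atom energy plus fit-to-density energy). Values are hypothetical.
from dataclasses import dataclass


@dataclass
class ModelScore:
    name: str
    all_atom_energy: float      # lower is better
    density_fit_energy: float   # lower (more negative) means better fit


def select_best(models, density_weight=1.0):
    """Return the model minimizing energy + weight * density term.

    density_weight = 1.0 is an assumption; the source does not state a weight.
    """
    return min(
        models,
        key=lambda m: m.all_atom_energy + density_weight * m.density_fit_energy,
    )


candidates = [
    ModelScore("model_0001", -950.0, -410.0),
    ModelScore("model_0002", -990.0, -380.0),
    ModelScore("model_0003", -940.0, -450.0),
]
best = select_best(candidates)
print(best.name)
```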

Using this model, a final refinement step was carried out in the context of the 
symmetrical assembly, improving model geometry and relieving clashes at the 
symmetric interfaces (DiMaio et al., 2015). The final model shows very good 
agreement to the density, with 504 of 558 traced residues matching the map 
with real-space correlations of 0.60 or greater (using density_tools in Rosetta),
and very good model geometry, with 0.36% Ramachandran outliers, 0%
rotamer outliers, a MolProbity clash score of 2.15, and an overall MolProbity
score of 1.38 (Chen et al., 2010).



Cell 160, 952-962, February 26, 2015 ©2015 Elsevier Inc. 959 












To test for overfitting during model building, we uniformly perturbed the final 
model and refined it against the independently generated EM map. A long 
refinement cycle (1,000 cycles of backbone rebuilding) was used to ensure 
the refined model is unbiased from the model fit to the original reconstruction. 
The resulting model had a 0.34 Å Cα RMSD to the original model.
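
For reference, the Cα RMSD values quoted in this section are root-mean-square deviations over matched Cα positions. The sketch below assumes the two coordinate lists are already superimposed and paired residue by residue; the function name and the toy coordinates are illustrative and not taken from the models.

```python
# Minimal sketch: RMSD over matched C-alpha coordinates (Å).
# Assumes the two models are already superimposed and residue-matched.
import math


def ca_rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Toy example: every atom displaced by 1 Å along z.
model_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model_b = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]
rmsd = ca_rmsd(model_a, model_b)
print(f"{rmsd:.2f} Å")
```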

Atomic B factors were capped to 600 for heavy atoms and to 720 for H 
atoms. Methionine in position 1 of VipA was not included in the model due 
to a lack of EM density and evidence from mass spectrometry analysis of iso- 
lated sheath (data not shown) that it is not present on the N terminus. 

Molecular Analysis 

Interaction energy was calculated using PISA (Krissinel and Henrick, 2007). 
Secondary structure prediction for the Figures S2A and S2B was performed 
by Jnet (Cole et al., 2008). Structural alignments were performed by RaptorX 
(Wang et al., 2013), and the RMSD presented in the text are calculated from 
these alignments. Structural homologs were found using PDB Structure Navi- 
gator (http://pdbj.org/strucnavi). 

Evolutionary Constraints 

Evolutionary constraints were generated by the Gremlin server
(http://gremlin.bakerlab.org/) (Ovchinnikov et al., 2014) or FreeContact software (Kajan et al.,
2014). All reliable constraints with scores over 1.5 are listed in Table S1. The
distance in 3D was measured between the weighted centers of mass of the
contacting residues. The distance was also estimated between the contacting
residues in the neighboring protomers, and in case the inter-protomer distance
was less than intra-protomer distance, the inter-protomer distance was used
in Table S1. This was implemented using Matlab (Mathworks).
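
The distance rule above can be sketched as follows. The original analysis was implemented in Matlab; this Python version is an illustration only. It assumes atomic mass as the weight for the center of mass (the weighting is not specified in the text), and all function names and example coordinates are hypothetical.

```python
# Minimal sketch of the constraint-distance rule: use the inter-protomer
# distance whenever it is shorter than the intra-protomer distance.
# Weighting by atomic mass is an assumption; names and data are illustrative.
import math


def weighted_center(atoms):
    """atoms: list of (mass, (x, y, z)). Returns the mass-weighted center."""
    total = sum(mass for mass, _ in atoms)
    return tuple(
        sum(mass * xyz[i] for mass, xyz in atoms) / total for i in range(3)
    )


def contact_distance(res_i, res_j_same, res_j_neighbors):
    """Smaller of the intra-protomer and best inter-protomer distance (Å)."""
    center_i = weighted_center(res_i)
    intra = math.dist(center_i, weighted_center(res_j_same))
    inter = min(
        (math.dist(center_i, weighted_center(r)) for r in res_j_neighbors),
        default=math.inf,
    )
    return min(intra, inter)


def reliable(constraints, cutoff=1.5):
    """Keep (i, j, score) constraints with score above the cutoff."""
    return [c for c in constraints if c[2] > cutoff]


res_i = [(12.0, (0.0, 0.0, 0.0))]
res_j = [(12.0, (10.0, 0.0, 0.0))]              # same protomer
res_j_copies = [[(12.0, (3.0, 4.0, 0.0))]]      # copy in a neighboring protomer
distance = contact_distance(res_i, res_j, res_j_copies)
kept = reliable([(10, 52, 1.6), (14, 80, 1.2)])
print(distance, kept)
```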

Coloring of the EM maps was done with Dynamo package for electron
tomographic image processing (Castano-Diez et al., 2012). The visualization of
atomic models, evolutionary constraints, and rendering of the Movie S1 was
performed in UCSF Chimera (Pettersen et al., 2004).
Perivascular TGF-β suppresses
proliferation but promotes invasion and
heterogeneity in squamous cell
carcinoma stem cells. These
TGF-β-responding cells reprogram
anti-oxidant metabolism and resist
anti-cancer therapy, leading to tumor
recurrence.
• We devise a system to monitor, manipulate, and track TGF-β
signaling in SCCs in vivo

• Perivascular TGF-β causes signaling-based heterogeneity
among SCC stem cells

• TGF-β slows proliferation but aids in malignancy and
anti-oxidant metabolism

• TGF-β-responding cells resist anti-cancer therapeutics,
leading to tumor recurrence

Accession number: GSE64867
SUMMARY

Subsets of long-lived, tumor-initiating stem cells
often escape cancer therapies. However, sources
and mechanisms that generate tumor heterogeneity
and drug-resistant cell populations are still unfolding.
Here, we devise a functional reporter system to
lineage trace and/or genetically ablate signaling in
TGF-β-activated squamous cell carcinoma stem
cells (SCC-SCs). Dissecting TGF-β’s impact on
malignant progression, we demonstrate that TGF-β
concentrating near tumor vasculature generates
heterogeneity in TGF-β signaling at the tumor-stroma
interface and bestows slower-cycling properties on
neighboring SCC-SCs. While non-responding progenies
proliferate faster and accelerate tumor growth,
TGF-β-responding progenies invade, aberrantly
differentiate, and affect gene expression. Intriguingly,
TGF-β-responding SCC-SCs show increased
protection against anti-cancer drugs, but
slower-cycling alone does not confer survival. Rather,
TGF-β transcriptionally activates p21, which stabilizes
NRF2, thereby markedly enhancing glutathione
metabolism and diminishing effectiveness of
anti-cancer therapeutics. Together, these findings establish
a surprising non-genetic paradigm for TGF-β
signaling in fueling heterogeneity in SCC-SCs, tumor
characteristics, and drug resistance.

INTRODUCTION 

Most tumors are of a clonal origin but often exhibit heterogeneity 
in phenotypic and functional properties including proliferation, 
morphology, motility, and differentiation. Such heterogeneity 
has also been implicated in the ability to survive therapy and 
seed metastases (Hanahan and Weinberg, 2011). Cumulative 
mutations resulting from genomic instability certainly produce 
heterogeneity (Greaves and Maley, 2012). However, develop- 
mental diversity of cell types is accomplished without genetic 
alterations, raising the possibility that cellular diversity within tu- 
mors may also arise from non-genetic factors. Contributing var- 
iations might come from the tumor microenvironment, which can 
transmit gradients of signaling factors, oxygen, and metabolites
to tumor cells depending upon their proximity to the local
sources (Meacham and Morrison, 2013; Kreso and Dick, 2014). While
the hypothesis is attractive, experimental evidence is lacking, 
and non-genetic mechanisms that drive tumor heterogeneity 
remain largely unknown. 

Irrespective of the basis for tumor heterogeneity, the long-lived
capacity of tumor-initiating stem cells (SCs) to self-renew,
initiate, and propagate cancers places these cells at the roots of
diversity. Furthermore, SCs are often few in number and can
exist in slow-cycling states, which has led to speculation that
cancer SCs may be the source of recurrence following anti-cancer
therapy (Hope et al., 2004; Berns, 2005; Notta et al., 2011;
Visvader and Stingl, 2014). Another potentially intertwined factor
is the need for long-lived SCs to adjust their metabolism in order 
to withstand stress and reactive oxygen species (ROS) (Diehn 
et al., 2009). In turn, such metabolic reprogramming can alter 
cellular behavior and lead to cancer progression (Bigarella 
et al., 2014). To this end, variations in cycling rates and/or local 
microenvironments could generate metabolic heterogeneities 
in cancer SCs, which could ultimately affect tumor heterogeneity 
and drug resistance. 

An excellent tumor model for addressing these issues is squa- 
mous cell carcinoma (SCC). Among the most common and life- 
threatening cancers world-wide, SCCs exhibit high rates of 
tumor recurrence following anti-cancer therapy. Both function- 
ally and molecularly, populations enriched for SCC-SCs have 
been identified, purified, and characterized. These SCC-SC-enriched
populations represent ~1%-5% of the tumor and reside
at the tumor-stroma interface. They are typified by elevated
integrins, and other markers, e.g., CD34, CD44, and SOX2
(Malanchi et al., 2008; Schober and Fuchs, 2011; Lapouge et al., 2012).
They also express VEGFA, suggestive of enrichment at the
vasculature (Beck et al., 2011). Interestingly, heterogeneity,
particularly in proliferative rates, exists within SCC-SC-enriched
populations (Schober and Fuchs, 2011). Whether a slow-cycling
property allows some SCs to escape chemotherapy and 
contribute to cancer recurrence has not been explored. 

Notably, SCC-SC numbers increase by ~10-fold when TβRII,
an essential component of the transforming growth factor β
(TGF-β) transmembrane receptor, is abrogated (Schober and
Fuchs, 2011). TGF-β is a well-established inhibitor of normal
epithelial cell proliferation, and conditional ablation of Tgfbr2 
predisposes epithelial tissues to cancer (Lu et al., 2006; Ijichi 
et al., 2006; Munoz et al., 2006; Guasch et al., 2007). 
Paradoxically, although elevated TGF-β signaling in skin
prevents chemical induction of benign papillomas, TGF-β enhances
their malignant conversion to SCCs and promotes metastasis
(Cui et al., 1996; Massague, 2012).

Researchers often attribute these seemingly distinct effects
of TGF-β to cumulative genetic changes during tumorigenesis.
However, since cycling rates of SCs are heterogeneous within
an SCC and since SC numbers increase in the absence
of TGF-β signaling, we posited that heterogeneity in
TGF-β-responsiveness might exist within SCC progenitors, and
might simultaneously restrict their proliferation and promote
invasion and malignant transformation. If so, TGF-β-mediated
differences in cycling rates of SCC-SCs could contribute to
metabolic heterogeneity, as well as ultimately, heterogeneity
in response to anti-cancer therapies. Elucidating how TGF-β
functions in cancer progression and metastasis is a prerequisite
for ascertaining whether disrupting this pathway is prudent for
metastatic therapeutics when its tumor suppressive features
might co-exist.

The TGF-β signaling pathway has been extensively studied.
When latent TGF-β ligands are processed and activated, they
can bind to TβRII, which phosphorylates TβRI, the other essential
component of this bipartite transmembrane receptor. Activated
TβRI propagates the signal by phosphorylating intracellular
downstream effectors, SMAD2 and SMAD3 (SMAD2/3), which
complex with SMAD4, translocate to the nucleus and bind to
specific DNA sequence motifs called SMAD-binding elements
(SBEs). Upon binding, pSMAD2/3-SMAD4 complexes interact
with additional transcriptional regulators to transactivate
TGF-β-responsive target genes (Massague et al., 2005; Mullen
et al., 2011).

The ability of TGF-β signaling to activate target genes enables
the pathway to impact diverse cellular processes including not
only proliferation but also differentiation, migration, apoptosis,
and ECM remodeling (Derynck and Miyazono, 2008; Massague,
2012; Oshimori and Fuchs, 2012a). Important questions now
emerge regarding TGF-β’s ability to unleash its varied and
temporal effects on tumor progression. Do TGF-β’s seemingly
opposing actions on proliferation and invasion act sequentially
or do they act simultaneously in tumor progression? Are these
dissimilar events dependent upon progressively distinct genetic
states that emerge during malignancy? Does TGF-β contribute
to heterogeneity in the tumor microenvironment and, if so,
how? Can heterogeneity in TGF-β signaling impact SCC-SCs
differentially and might this allow some cancer SCs to escape
anti-cancer drugs? If so, is it because of its ability to impact
proliferation rates, affect metabolic states and/or alter the
expression of key target genes?

To tackle these issues, we devised a strategy to monitor, track,
and modify TGF-β signaling in mouse skin during malignant
progression. In so doing, we’ve been able to delineate the temporal
functions of TGF-β in SCCs as they develop and progress.
Combined with transcriptional profiling, molecular, biochemical,
and genetic studies, we unearth important functions for TGF-β
signaling during the process and unveil its impact not only on
cancer SC proliferation but also on the emergence of tumor
heterogeneity and anti-cancer drug resistance. Moreover, we
show that metabolic reprogramming, an emerging hallmark of
cancer, is also integrally linked to TGF-β-mediated effects on
cancer SCs, and that TGF-β-regulated metabolism in particular
plays a critical role in the divergent responses to anti-cancer
therapies.

RESULTS 

An In Vivo Reporter System for Lineage Tracing and
Manipulating TGF-β-Responding Cells During Malignant
Progression

To identify putative TGF-β-responding cells within skin tumors,
we first performed anti-phospho (active) SMAD2 immunofluorescence
on mouse skin at various stages following classical
carcinogenic protocols with tumor-initiator
7,12-dimethylbenz(a)anthracene (DMBA) and tumor-promoter
12-O-tetradecanoyl-phorbol-13-acetate (TPA) (Figure 1A). As reported
previously (Oshimori and Fuchs, 2012b), nuclear pSMAD2 was barely
detectable in interfollicular epidermis in normal skin. It appeared
transiently in follicular stem cells (SCs) at the start of a new hair
cycle. As benign papillomas formed, pSMAD2 immunolabeling
remained low in epithelium but was found in some stromal cells.
As papillomas transitioned to malignant SCC, marked nuclear
pSMAD2 appeared in epithelial cells at the invasive tumor front.
Keratin 14 (K14)-Cre-mediated ablation of Tgfbr2 specifically in
skin epithelium resulted in complete loss of pSMAD2 in SCCs,
but not surrounding stroma. These findings underscored the
efficacy of the antibody and the dependence of SMAD2/3 activation
in SCCs on TGF-β/TβRI/II signaling, rather than pathways
triggered by Nodal or Activins.

To monitor TβRI/II-pSMAD2 signaling in vivo, we designed
a lentiviral (LV) TGF-β reporter system that used an enhancer
composed of multimerized pSMAD2/3 binding elements (SBE)
to drive a P2A-based bicistronic transcript encoding nuclear
(NLS) mCherry and tamoxifen (Tam)-activatable CreER
recombinase (herein called TGFβ-CreER). We inserted a polymerase
III-driven promoter in the opposite direction to simultaneously
drive an shRNA to achieve knockdown (KD) of a desired
transcript (Figure S1A). We first tested this reporter in primary
keratinocytes (1°MKs) cultured from Rosa26-lox-STOP-lox-EYFP
Cre reporter (Rosa-YFP) mice bred on either a Tgfbr2+/+ or
Tgfbr2fl/fl background (Figure S1B). Upon TGF-β treatment,
reporter-transduced 1°MKs expressed NLS-mCherry. Tam then
activated CreER in TGF-β-responding MKs, leading to constitutive
Rosa-YFP expression. Upon Tgfbr2 ablation, cells lost
TGF-β responsiveness, and in turn extinguished NLS-mCherry
expression (Figure S1B).

We brought the TGF-β reporter system into an in vivo setting
by injecting LV into the amniotic sac of E9.5 mouse embryos.
This in utero method allowed titer-dependent (1% to >90%) selective
transduction of the unspecified surface epithelial progenitors
that give rise to skin epithelia (Beronja et al., 2010). By adding a
reverse tetracycline transactivator (rtTA) cassette [PGK-rtTA3]
to our LV TGF-β reporter construct, we could transduce
TRE-Hras1G12V transgenic mice (Chin et al., 1999) and then induce
oncogenic HRasG12V with doxycycline (Doxy) (Figure 1B).

In the first experiment, we crossed TRE-Hras1G12V and
TRE-H2BGFP mice, and sparsely delivered TGF-β reporter to surface
ectoderm of transgenic embryos. Postnatally, doxy-dependent
Figure 1. Lentiviral TGF-β Reporter System for Probing Malignant Transformation In Vivo

(A) pSMAD2 immunolocalization in normal mouse skin and at different stages of DMBA-TPA-induced malignant progression to SCC. Integrin α6 denotes the
boundary of tumor epithelia (Tu) and stroma (St). IFE: interfollicular epidermis, HF: hair follicle.

(B) Schematic of LV-mediated in vivo TGF-β reporter and KO/KD system. NLS-mCherry and CreER are under the control of TGF-β signaling. shRNA and rtTA3
transcription factor are under constitutive promoter regulation. LV transduction of surface epithelium of live E9.5 TetO-HrasG12V × Rosa-YFP embryos was
achieved by in utero ultrasound-guided microinjection into the amniotic sac. Doxy-induction of HRasG12V initiates tumorigenesis. When desired, CreER is
activated by Tam to induce recombination-dependent Rosa-YFP.

(C) Epifluorescence detection of TGF-β-pSMAD2 signaling in HRasG12V SCC.

(D) Limit-dilution orthotopic transplantation of primary tumor basal cells ± TGF-β reporter activity (10^ and 10"^ cells; n=8, 10® cells; n=3).

(E) Epifluorescent TGF-β reporter activity with pan-anti-TGF-β and anti-α6 immunofluorescence shows that basal tumor cells with high TGF-β reporter activity are
juxtaposed to stroma with high TGF-β (right). Note heterogeneity demarcated by vertical dotted line.

Scale bars, 50 μm. See also Figure S1.



HRasG12V induction occurred exclusively in LV-transduced
cells. Within 1-2 months, papillomas formed, which often rapidly
progressed to SCC in all or part of the tumor. Tumor epithelia
were GFP+, reflecting their derivation from rtTA-expressing,
LV-transduced cells (Figure S1C). Additionally, the low levels of
virus used (MOI ≪ 1) suggest that these tumors were clonally
derived.

In remaining studies, we used TRE-Hras1G12V × Rosa-YFP ×
Tgfbr2 (+/fl or fl/fl) mice. TGF-β reporter activity (NLS-mCherry)
co-localized with pSMAD2 and was particularly intense in a subset
of basal tumor cells at invasive fronts (Figures 1C and S1D).
Serial transplantation assays previously revealed that the
tumor-stromal interface is where cells exist that have long-term,
tumor-initiating potential, defined as SCC-SCs (Schober and Fuchs,
2011; Lapouge et al., 2012). To address whether the heterogeneity
of TGF-β signaling at this interface might be relevant to
SCC-SCs, we used fluorescence-activated cell sorting (FACS)
to fractionate CD44+α6hi basal tumor cells according to mCherry
expression. mCherry+ cells frequently expressed SC marker
CD34 (Figure S1E). In vitro, the colony forming efficiency of
TGF-β-responding basal cells was higher than non-responding
counterparts, while in direct transplantation assays,
FACS-purified TGF-β reporter+ basal cells displayed ~5× higher
tumor-initiating frequency than reporterneg counterparts (Figures 1D
and S1F). Together, these data indicate that the
TGF-β-responding subset of CD44+CD34+α6hi basal tumor cells was enriched
for SCC-SCs.

Immunostaining revealed heterogeneity in TGF-β ligand
distribution in the stroma that correlated well with basal tumor cell
heterogeneity in TGF-β signaling (Figure 1E). Of the various cell
Figure 2. TGF-β Signaling-Driven Lineage Tracing during Tumor Development

(A) Experimental scheme and representative images of TGF-β signaling-driven lineage tracing. Once tumors form (>7 mm), NLS-mCherry is detected in
TGF-β-responding transduced cells (Tam−). A single i.p. injection of low-dose Tam elicits Rosa-YFP recombination in a small subset of these cells within 2d post-Tam
injection (2dpi). Note YFP/mCherry double-positive cell at invasive front. YFP-marked cells undergo clonal expansion, evident at 7 and 14 dpi, and now
independent of TGF-β reporter activity (right). Note difference in clonal expansion rate and morphology based upon whether the initial marked cell is from a tumor
initiated on a Tgfbr2 (fl/+) (top) or (fl/fl) (bottom) genetic background.

(B) Immunolabeling (left) and quantifications (right) show that suprabasal differentiation marker K10 preferentially marks YFP+ clones (n=19 analyzed) derived
from K14-CreER-induced basal tumor cells. Note that TGFβ-CreER-induced lineage tracing marks TGF-β-responding basal cells that yield YFP+K10neg clones
(n=21). Data are mean ± SEM.

(C) S-phase analysis during lineage tracing. BrdU or EdU was administered at 2 or 7 dpi, respectively (see Figures S2H and S2I), and YFPneg and YFP+ basal tumor
cells were quantified for nucleotide incorporation (n = 11-15).

Data are box-and-whisker plots. Scale bars, 50 μm. See also Figure S2.



types surrounding the tumor, TGF-β immunolabeling best
overlapped with CD11b+Ly6C+ monocytic myeloid cells
(Figure S1G), in agreement with a prior report that human peripheral
blood monocytes secrete TGF-β (Grotendorst et al., 1989).
Interestingly, TGF-β+ cells often localized near vasculature, while
nuclear pSMAD2 gave a complementary pattern in nearby
SCC-SCs (Figures S1G and S1H). The spatial relation between
epithelial TGF-β reporter activity and tumor vasculature was
exemplified by three-dimensional (3D) microscopic imaging of
the tumor-stromal interface (Movies S1 and S2). These results
suggest that TGF-β ligand distribution coincides with vasculature
and immune cell heterogeneity in tumor microenvironment,
and this in turn, generates regional TβRI/II-pSMAD2 signaling
within nearby malignant epithelial progenitors at the
tumor-stroma interface.



Lineage Tracing Unveils Distinct Behaviors of
TGF-β-Responding Versus Non-Responding SCC-SCs

To track the fate of TGF-β-responsive cells during early tumor
progression in vivo, we performed TGF-β signaling-dependent
lineage tracing by transducing our TGF-β reporter at low MOI
as before and then administering Doxy at birth to induce
tumorigenesis. Once tumors reached ~7 mm in size, a single low dose
of Tam was then administered systemically to trigger ~24 hr of
CreER activity in a small subset of TGF-β-reporter-activated
tumor cells (diagram in Figure 2A). Prior to Tam injection, emerging
tumors showed mCherry but no YFP, underscoring the dependency
of mCherry/CreER bicistronic expression on TGF-β but
the reliance of CreER activation on Tam.

At 2 days post-Tam injection (dpi), single or small clusters
of 2-4 YFP+ cells were found at the tumor-stroma interface
(Figure 2A). YFP+TβRII+ clones grew markedly over the
subsequent 2 weeks and showed enrichment at invasive protrusions.
Surprisingly, however, cells within these clones were highly
scattered (shown).

On the Tgfbr2fl/fl background, Tam resulted in Tgfbr2 ablation
specifically in the few random TGF-β-responding tumor cells that
had activated CreER and YFP. YFP+TβRIInegpSMAD2neg cells
underwent clonal growth faster, and clones were more tightly
packed and less intrusive than TβRII+ counterparts (Figure 2A).
Importantly, because we ablated Tgfbr2 in TGF-β-responsive
cells, this difference was not attributable to microenvironmental
heterogeneity but rather to intrinsic SC differences arising from
loss of TGF-β signaling.

The increased scattering and invasiveness of clones derived
from TGF-β-responsive tumor cells was accompanied by several
features typically associated with epithelial to mesenchymal
transition (EMT), including an elongated cell shape, reduced
E-cadherin, and enhanced ZEB2 and HMGA2 (Figures S2A-S2D,
Movie S3) (Thuault et al., 2006; Hanahan and Weinberg, 2011).
By 3D reconstruction of Z-stack images, the clonal nature of
the expanding colonies was still discernable, as was their close
proximity to tumor vasculatures (Movie S4).

Serendipitously, additional functional lineage tracing unveiled
a role for TGF-β signaling in generating aberrant differentiation
during malignant progression. When Tam was given to tumors
from K14-CreER × Rosa-YFP × TRE-HrasG12V mice transduced
with LV-rtTA3, many YFP+ clones from total basal tumor cells
(K14+/K5+) displayed the differentiation keratin K10 suprabasally
(Figure 2B). By contrast, in TGFβ-CreER-driven YFP+ clones,
K10 was rarely detected, and leading edge cells showed
reduced K5 (Figure S2E). Conversely, K13 and K18, ectopically
induced in skin SCCs (Nischt et al., 1988; Yamashiro
et al., 2010), were readily detected in TGFβ-CreER-driven YFP+
clones, but not in most K14-CreER-driven ones nor in unmarked,
TGF-β-non-responsive basal tumor cells from TGFβ-CreER
animals (Figures S2F and S2G).

As shown in Figures 2C and S2H, fewer TGF-β-responding
(YFP+) basal cells were in S-phase relative to TGF-β
non-responding (YFPneg) basal cells. Interestingly, however, as YFP+
basal cells clonally expanded, their SCs remained less proliferative,
suggestive of a prolonged slower-cycling state within the
TGF-β lineage. By contrast, mosaic YFP+TβRIIneg basal clones
showed high cycling rates compared to their YFPnegTβRII+
neighbors (Figures 2C and S2I).

Taken together, our in vivo data provided compelling evidence
that TGF-β signaling is directly responsible for generating a
pool of slower-cycling SCC-SCs. Moreover, these data further
suggest that TGF-β is involved in a non-genetic mechanism
that underlies the emergence of tumor heterogeneity at
perivascular regions and which leads to simultaneous invasiveness, cell
dissemination, and aberrant differentiation, at the expense of SC
proliferation and tumor growth.

TGF-β Protects SCC Progenitors From Anti-Cancer
Drugs

It has long been suggested that slower-cycling SCs might be
refractory to chemotherapeutic anti-proliferative cancer drugs.
One of the most widely used anti-cancer drugs, cisplatin
[cis-diamminedichloroplatinum (II)], is the standard chemotherapy for
head and neck SCC, and has been used to treat advanced
cutaneous SCC. However, tumor recurrence is a major problem.

To test whether TGF-β signaling might be involved in drug
resistance, we first conducted a series of in vitro experiments.
We prepared HRasG12V-expressing 1°MKs from Tgfbr2fl/fl ×
Rosa-YFP mice and transduced them with TGF-β reporter-CreER.
After TGF-β1 ± Tam, YFPnegTβRII+ and YFP+TβRIIneg
isogenic MKs were co-cultured to assess phenotypes under
identical conditions. As expected, TGF-β1 caused reporter
activation and growth arrest in YFPnegTβRII+, but not YFP+TβRIIneg
MKs (Figures 3A and 3B).

Cisplatin exerts its cytotoxicity by forming DNA-cisplatin
adducts that are recognized by DNA damage recognition
complexes which trigger apoptosis (Kelland, 2007). Consistent with
these effects, cisplatin caused apoptotic rounding and γ-H2AX
(marking DNA double-strand breaks) throughout proliferating
cultures (Figures 3C and 3D). Interestingly, after exposure to
TGF-β1 for 24-36 hr, most YFPnegTβRII+ cells remained spread,
with markedly reduced γ-H2AX compared to YFP+TβRIIneg
counterparts. Quantifications with active Caspase-3 (AcCasp3)
indicated that TGF-β1 pre-treatment significantly reduced
cisplatin-induced apoptotic death (Figure 3E). Moreover, an
antibody recognizing DNA-cisplatin adducts showed preferential
immunolabeling of YFP+TβRIIneg MKs in cisplatin-treated
cultures (Figure 3F). These results suggested that TGF-β signaling
enables cultured SCC-SCs to better withstand cisplatin-induced
apoptosis.

TGF-β-Responding SCC Progenitors Are Responsible
for Drug Resistance and Tumor Recurrence

To test TGF-β’s protective qualities in vivo, we challenged
tumor-bearing TGF-β reporter mice with systemic cisplatin.
While few AcCasp3+ cells were detected in saline-injected control
mice, cisplatin significantly increased apoptosis within skin
tumor cells. Strikingly, fewer basal tumor cells with active TGF-β
signaling were apoptotic (Figure 4A and Movie S5).

If TGF-β signaling contributes to drug therapy failure,
TGF-β-responding cells should remain during treatment and outgrow
their non-resistant peers over time. To test this hypothesis,
we conducted TGF-β reporter-driven lineage tracing on TβRII+
HRasG12V tumors during systemic cisplatin treatment. To ensure
that we were monitoring TGF-β reporter+ progeny, we carried
out Tam treatment (3 doses, 12 hr each) until 1.5 days prior to
cisplatin administration. Relative to saline controls, cisplatin
injections resulted in a striking reduction in tumor volume within
10-12 days (Figure S3A).

Most notably, the remaining tumors, some of which already
showed recurrent growth (Figure S3A), were disproportionately
maintained by YFP+ progenies of TGF-β-responding cells (Figures
4B and 4C). This was notable since recurrent tumors still
exhibited TGF-β reporter activity, and since even though overall
proliferative rates of YFP+ SCC-SCs were elevated in these
recurring tumors, the TGF-β reporter+ cohort of basal tumor cells
still remained slower-cycling relative to TGF-β reporterneg
counterparts (Figures 4C and 4D). Finally, as in the primary tumor
clones derived from TGF-β reporter+ lineages, K10 was broadly
absent in YFP+ suprabasal cells, suggesting that recurring
tumors were enriched for malignant TGF-β-pSMAD2-signaling
basal cells that had survived cisplatin treatment (Figure 4E).

Figure 3. Active TGF-β Signaling in Basal Tumor Cells Increases Their Protection against Cisplatin-Induced Apoptosis

(A) Experimental scheme to derive isogenic HRasG12V-expressing YFPnegTβRII+ and YFP+TβRIIneg MKs from 1° cultures. (Right) Epifluorescence of cultures ±
TGF-β1. Note that only YFPnegTβRII+ cells are TGF-β reporter+ (mCherry+).

(B) Growth curves of TβRII+ and TβRIIneg cells ± 100 pM TGF-β1. Data are mean ± SEM.

(C) Immunofluorescence of BSA- or TGF-β1-pretreated (36 hr) YFPnegTβRII+ and YFP+TβRIIneg MKs treated 10 hr with cisplatin. Note that YFP+TβRIIneg MKs
frequently displayed signs of apoptotic rounding.

(D) γ-H2AX detection of DNA double-strand breaks induced by cisplatin treatment. Note that both YFPnegTβRII+ and YFP+TβRIIneg MKs show γ-H2AX signal in
control, but TGF-β1 selectively spares TβRII+ cells, which remain spread.

(E) Quantifications of AcCasp3+ cells in the same experiment as in (C) and (D) (n=15 microscopic image fields). Data are box-and-whisker plots.

(F) Immunodetection of adduct formed between cisplatin and DNA. Note that YFP+TβRIIneg cells have more cisplatin-modified DNA (red). Anti-tubulin (white).
Scale bars, 50 μm.

If TGF-β signaling confers increased cisplatin protection to
SCC-SCs in vivo, then ablating TGF-β signaling in these cells
should confer increased sensitivity to apoptosis. To test this
hypothesis, we repeated the experiment shown in Figure 4B, this
time on Tgfbr2+/+ (control) and Tgfbr2fl/fl (test) backgrounds. To
ensure optimal Tgfbr2 allele targeting, we extended Tam treatments
until 3 d prior to cisplatin. As shown in Figure 4F, loss of
TGF-β signaling resulted in enhanced cisplatin sensitivity.

We obtained similar results when we transduced human SCC
lines with TGF-β reporter LV and xenografted them in Nude mice
(Figure S3B). Upon cisplatin treatment, TGF-β reporterneg human
basal tumor cells showed greater sensitivity than their TGF-β
reporter+ counterparts (Figure 4G). When the experiment was
repeated with TGF-β inhibitor LY364947, a dramatic increase
in cisplatin sensitivity was observed (Figures 4H and S3C).
Together, these findings suggest that both in mouse and human,
TGF-β signaling is heterogeneous and increases protection of
SCC-SCs against cisplatin.

Transcriptome Analysis Uncovers a Link between TGF-β
Signaling and Glutathione Metabolism in SCC Stem
Cells

The ability of TGF-β signaling to confer enhanced survival
to SCC-SCs of cisplatin-treated tumors was consistent with
the cancer stem cell hypothesis for tumor recurrence. A priori,
TGF-β’s power could arise from its impact on slow-cycling
behavior and/or its effect on transcription. To gain further
insights, we used FACS and RNA sequencing (RNA-seq) to
purify, transcriptionally profile and compare TGF-β-responding
(mCherry+) versus non-responding (mCherryneg) SC populations
from >7 mm HRasG12V-induced tumors (Figures S4A-S4C).
Our purification scheme was optimized by depleting stromal
cell types and dead cells, and then positively selecting for high
surface expression of markers of SCC-SCs.

Independent duplicates yielded highly reproducible RNA-seq
data (Figure 5A). Dendrogram analysis indicated that despite
similarities across SCC-SC profiles, TGF-β reporter+ samples
clustered together and apart from the reporter-negative samples.
We defined our TGF-β-responsive SCC-SC signature as
genes whose transcripts at an FPKM > 1 were differentially
expressed by |log2 fold change| > 1 and which displayed statistical
significance (p < 0.05, q < 0.05) across datasets. Our signature
consisted of 632 up- and 478 downregulated genes (Figure S4D).
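
The selection criteria above amount to a simple per-gene filter. The Python sketch below is illustrative only and is not the authors' pipeline: the record layout (`GeneStat`), the function names, and the choice to apply the FPKM cutoff to whichever population expresses the gene more highly are assumptions made here. Only the thresholds (FPKM > 1, |log2 fold change| > 1, p < 0.05, q < 0.05) come from the text.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class GeneStat:
    """Hypothetical per-gene summary from a differential-expression run."""
    name: str
    fpkm_pos: float  # expression in TGF-beta reporter+ cells
    fpkm_neg: float  # expression in reporter-negative cells
    log2_fc: float   # log2(reporter+ / reporter-negative)
    p: float         # nominal p value
    q: float         # multiple-testing-adjusted value


def in_signature(g: GeneStat, min_fpkm: float = 1.0,
                 min_abs_log2_fc: float = 1.0, alpha: float = 0.05) -> bool:
    # Assumption: a gene counts as expressed if it exceeds the FPKM
    # cutoff in at least one of the two populations.
    expressed = max(g.fpkm_pos, g.fpkm_neg) > min_fpkm
    changed = abs(g.log2_fc) > min_abs_log2_fc
    significant = g.p < alpha and g.q < alpha
    return expressed and changed and significant


def split_signature(genes: Iterable[GeneStat]) -> Tuple[List[str], List[str]]:
    """Return (upregulated, downregulated) gene names passing all cutoffs."""
    up: List[str] = []
    down: List[str] = []
    for g in genes:
        if in_signature(g):
            (up if g.log2_fc > 0 else down).append(g.name)
    return up, down
```

Because every comparison is strict, a gene sitting exactly on a cutoff (for example a log2 fold change of exactly 1) is excluded, which matches the inequalities as written.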

Several noteworthy alterations were immediately evident in the
TGF-β signature. Consistent with our findings thus far,
cancer-related differentiation genes, e.g., Krt13, were upregulated.
As expected, there was significant overlap with prior signatures
obtained from SCC-SCs purified independently of TGF-β status (Schober
and Fuchs, 2011; Lapouge et al., 2012). Notably, however, Sox2,
Pitx1, Vegfa and other genes typifying these SCC-SC signatures
(Boumahdi et al., 2014; Siegle et al., 2014) were more enriched
in the TGF-β reporter+ subset than in the total basal SC population.
There was also overlap (28%) between our TGF-β reporter+ SCC
basal cell RNA-seq and the signature obtained from
VEGF-overexpressing papilloma basal cells (Beck et al., 2011).

Figure 4. TGF-β-Responding SCC-SCs Show Enhanced Drug Resistance In Vivo

(A) Immunodetection of AcCasp3 (green) and TGF-β reporter (red) in tumors from mice administered saline or cisplatin. (Right) Quantifications revealing that
TGF-β signaling protected basal tumor cells from cisplatin-induced apoptosis (3 tumors analyzed; >15 microscopic image fields per tumor).

(B) Experimental scheme and representative examples of lineage tracing to monitor the fate of the TGF-β reporter+ subset of basal tumor cells after cisplatin
treatment. (Right) Note that resistant SCCs are largely contributed by TGF-β-responding cells that survived cisplatin.

(C) Quantifications show that the % of YFP+ basal tumor cells increases after cisplatin, suggestive of their preferential survival.

(D) Quantifications of basal tumor cells in G2/M phase (phospho-H3+) that are YFPneg vs YFP+ (left) and reporterneg vs reporter+ (right). Note: although proliferation
of basal tumor cells that resist cisplatin is generally elevated, TGF-β-responding SCC-SCs are still slower-cycling.

(E) Recurring tumor clones that resist cisplatin are largely YFP+ and express little K10, consistent with their derivation from TGF-β reporter+ basal cells.

(F) Quantification of apoptotic cells in cisplatin-treated tumors from LV-transduced Tgfbr2+/+ or Tgfbr2fl/fl mice treated as in (B). Note enhanced survival of TβRII+
progenies.

(G and H) Quantification of AcCasp3+ cells after cisplatin treatment of xenografts of (G) TGF-β reporter-transduced human SCC cells (reporter+ vs reporterneg), and
(H) xenografts pre-treated ± LY364947 (reporter+ vs reporterneg) (n>3).

Data are box-and-whisker plots (C-F) and mean ± SEM (G and H). Scale bars, 50 μm. See also Figure S3.

Gene ontology (GO) term analysis provided insights into
biological processes enriched in TGF-β-responding SCC-SCs. In
addition to epidermal development, lipid metabolism, and cell
proliferation, “reduction and oxidation” (Redox) genes
surfaced among top GO terms. This was especially interesting
given that the top upregulated gene pathway in KEGG analysis
was “glutathione metabolism” (Figure 5A). Notably, glutathione
is the most abundant intracellular antioxidant in animal cells and
involves two important metabolic processes (Figure 5B)
(Lushchak, 2012). The first is a reduction reaction, which prevents
damage from reactive oxygen species (ROS) by exhausting
ROS through the conversion of reduced glutathione (GSH)
to its oxidized state (glutathione disulfide, GSSG). The second
key process is a conjugation reaction regulated by glutathione
S-transferase (GST), which is known to metabolize cisplatin
(Kelland, 2007).

Figure 5. Transcriptional Profiling Reveals a Link between TGF-β Signaling and Glutathione Metabolism

(A) Summary of transcriptional profiling of TGF-β reporter+ vs reporterneg tumor basal cells by RNA-seq. Significantly upregulated genes are listed on the right
side; note genes involved in glutathione metabolism (red), Redox (asterisk). Other genes relevant to text are bolded. (Bottom) Gene ontology (GO: biological
function) and KEGG pathway analyses.

(B) Schematic of glutathione metabolic pathway: 1) Reduction reaction; 2) Conjugation reaction. Genes in red are significantly upregulated in TGF-β reporter+ SCs.

(C) RNA-seq validation by qRT-PCR of independently derived in vivo tumor basal cell RNA samples. Data are mean ± SEM.

(D) Flow cytometry analysis of ROS levels in basal cells in normal skin epidermis and tumor epithelia ± TGF-β reporter activity. See also Figure S4.

Our RNA-seq data indicated that genes involved in GSH-conjugation,
GSH-mediated reduction and GSH-recycling processes
were broadly upregulated in TGF-β reporter+ SCs. This
list included GST genes (Gsta1-5, Gsto1) (Figure 5A). qRT-PCR
on RNAs from independent tumor samples confirmed their
enhanced expression in TGF-β reporter+ basal tumor cells
(Figure 5C). Moreover, as judged by CellROX green, a cell-permeant
dye that is brightly fluorescent only upon oxidation by ROS, ROS
levels were significantly lower in TGF-β reporter+ tumor cells,
consistent with the high expression of genes involved in
glutathione metabolism (Figure 5D).

TGF-β Induces p21, which in Turn Stabilizes and
Activates NRF2-Dependent Transcription in SCC-SCs

The myriad of glutathione metabolism genes upregulated
in TGF-β-responding SCC-SCs necessitated further insights
before we could abrogate the pathway and assess the
consequences to cisplatin resistance. Interestingly, the
enhancer/promoters of these genes contained antioxidant response elements
(AREs), which are the consensus binding motifs for transcription
factor NRF2 (Gorrini et al., 2013) (Figure 6A). Additionally, as
judged by immunofluorescence, nuclear NRF2 was prominent
in TGF-β reporter+ cells at the tumor-stroma interface (Figure 6B).
However, neither RNA-seq nor qRT-PCR data showed TGF-β
reporter-dependent differences in NRF2 gene (Nfe2l2) transcription
(Figure 6A), indicating that other steps must be involved in
upregulating the antioxidant gene response in this SCC-SC
subset.

Probing mechanism, we first considered KEAP1, which normally
binds to and targets NRF2 for proteasome-mediated
degradation. TGF-β activity did not affect Keap1 mRNA levels,
diminishing the likelihood that KEAP1 absence might underlie
NRF2 stabilization. By contrast, cyclin-dependent kinase inhibitor
p21 purportedly competes with KEAP1 for NRF2 binding
(Chen et al., 2009), and Cdkn1a, encoding p21, is an established
target of TGF-β-SMAD signaling (Seoane et al., 2004; Koinuma
et al., 2009). Indeed, RNA-seq and qRT-PCR showed Cdkn1a
mRNA was upregulated in TGF-β reporter+ SCC-SCs in vivo
(Figures 5A and 6A). Immunofluorescence further revealed a strong
correlation between TGF-β signaling activity and p21 expression
at the tumor-stroma interface (Figure 6C). Moreover, whereas
NRF2 and p21 were readily detected in TGF-β-responding
tumor cells at the tumor-stroma interface, their expression
was not seen in neighboring TβRIIneg tumor cells derived from
TGFβ-CreER activation nor in early papillomas, which do not
show appreciable TGF-β signaling (Figures S5A-S5C). In vitro
studies corroborated these findings (Figure S5D).

Figure 6. TGF-β Target p21 Is Required for NRF2-Dependent Activation of Antioxidant Genes

(A) Nucleotide sequence of NRF2 binding motifs within the 5'-upstream region of Gst and other NRF2 target genes. Nucleotides in capital letters are those shared
by the antioxidant response element (ARE) consensus sequence. (Bottom) qRT-PCR analysis of in vivo tumor basal cell RNA samples. Data are mean ± SEM.

(B and C) Co-expression of NRF2 or p21 (green) and TGF-β reporter (NLS-mCherry) at tumor-stroma interface of TβRII+ tumor sections. Fluorescent intensities of
NRF2 and p21 staining in TGF-β reporter+ and reporterneg cells were quantified (NRF2: n=78 and 57 cells, p21: n=101 and 71 cells). Data are box-and-whisker
plots.

(D) Immunoblotting of lysates prepared from HRasG12V-overexpressing TβRII+ 1°MKs stimulated with TGF-β1 for indicated times.

(E) Immunoblotting of lysates prepared from HRasG12V-overexpressing TβRII+ and TβRIIneg 1°MKs stimulated with TGF-β1 for 36 hr.

(F) Immunoblotting of lysates prepared from 36 hr TGF-β1-treated HRasG12V-overexpressing 1°MKs transduced with scramble, Cdkn1a or Nfe2l2 shRNAs.

(G) LV NRF2-ARE reporter. (Right) Immunofluorescence and immunoblots of NRF2-reporter transduced HRasG12V-induced MKs expressing scramble (control),
Cdkn1a, Nfe2l2, or Maff (control) shRNAs ± TGF-β1 stimulation (36 hr). Note that ARE-reporter activity is abolished upon Cdkn1a or Nfe2l2 but not control KD.
Scale bars, 50 μm. See also Figure S5.

Delving deeper, we treated HRasG12V-transformed 1°MKs
with TGF-β1 and then checked by qRT-PCR for temporal
changes in levels of select mRNAs (Figures S5E and S5F). In
contrast to Nfe2l2, whose transcripts remained constant during
the experiment, Cdkn1a transcripts rose 1-3 hr after TGF-β1
treatment, reaching ~5× normal levels by 48 hr. Moreover, by
immunoblot, NRF2 and p21 were elevated upon prolonged
TGF-β stimulation (Figure 6D). These effects were abrogated in
TβRIIneg HRasG12V 1°MKs (Figure 6E), supporting the view that
p21 induced by TGF-β/pSMAD2 signaling mediates NRF2 protein
stabilization.

To rigorously test the hierarchical relation between TGF-β, p21
and NRF2, we used LVs harboring Cdkn1a and Nfe2l2 shRNAs
(Figure S5G). Cdkn1a shRNAs not only efficiently depleted p21
but also reduced NRF2 protein in TGF-β-stimulated cells
(Figure 6F). By contrast, Nfe2l2 depletion did not affect p21 levels
induced by TGF-β/pSMAD2 signaling. These data place p21
downstream of TGF-β signaling and upstream of NRF2
stabilization. Indeed, as judged by immunoprecipitation, the
interaction between p21 and NRF2 was dependent upon TGF-β1
(Figure S5H).

Figure 7. TGF-β-Induced Glutathione Metabolism Confers Enhanced Anti-Cancer Drug Resistance In Vivo

(A) Coimmunolabeling of TGF-β reporter and GSTα in representative sections of SCCs from LV-transduced mice. Note that nuclear TGF-β reporter signal (red)
and cytoplasmic GSTα (green) overlap in invasive cells at the tumor-stroma interface.

(B) Immunofluorescence of TβRII+ tumor sections from transductions of our LV reporter harboring scramble control or Cdkn1a shRNAs. p21 (green)
correlates well with TGF-β reporter activity (red) in scramble control and is silenced by Cdkn1a shRNA expression.

(C) Quantifications of GSTα immunofluorescence intensities of TGF-β reporter+ and reporterneg basal tumor cells of mice transduced with scramble, Cdkn1a or
Nfe2l2 shRNAs (n=144, 145 or 121 cells).

(D) Quantifications of proliferation (pH3+) and apoptosis (AcCasp3+) in basal cells from HRasG12V-derived SCCs of mice transduced with LVs harboring scramble
control, Cdkn1a, or Nfe2l2 shRNAs.

(E) Coimmunolabeling and quantifications (n=3 tumors, >16 microscopic image fields) of AcCasp3 (green, cytoplasmic) and TGF-β reporter activity (red, nuclear)
in Cdkn1a or Nfe2l2 KD basal cells of SCCs from cisplatin-treated mice. Yellow arrowheads denote double-labeled cells. Note that without p21 or NRF2, TGF-β
reporter+ cells are no longer able to resist apoptosis in response to cisplatin.

(F) TGF-β1-pretreated MKs were exposed to cisplatin ± a potent GST inhibitor, ethacrynic acid. Note that most YFPnegTβRII+ MKs (red arrowheads) still exhibited
robust TGF-β reporter activity, but now showed apoptotic rounding like their YFP+TβRIIneg counterparts.

(G and H) Keap1 KD stabilizes NRF2 in TβRIIneg SCC-SCs and renders them resistant to cisplatin. (G) Immunoblotting of lysates prepared from HRasG12V-overexpressing
1°MKs transduced with scramble or Keap1 shRNAs. (Middle) qRT-PCR of NRF2-reporter (mCherry mRNA) expression. (Right) NRF2 immunofluorescence
(green) in either scramble- or Keap1 shRNA-transduced (H2B-RFP+) TβRIIneg cells. Note that NRF2 is readily detected in TβRIIneg cells only if transduced

(legend continued below)

To test the functional significance of this hierarchical relation,
we created a LV NRF2 reporter (Figure 6G). NRF2 reporter activity
was potently induced in 1°MKs not only by the classical ROS,
hydrogen peroxide (H2O2) (Figure S5I), but also by TGF-β
stimulation (Figure S5J). Comparative analyses showed that while not
as robust as the maximal effect achieved with 500 μM H2O2, the
effects of TGF-β1 were comparable to 100-200 μM H2O2
(Figure S5J). Moreover, NRF2 reporter activation by either H2O2 or
TGF-β1 was abolished when these cells were transduced with
LVs harboring Cdkn1a or Nfe2l2 shRNAs (Figures 6G and S5K).
Together, these results provide compelling evidence that p21
is required for the NRF2-mediated target gene expression that
occurs downstream of TGF-β/pSMAD2 signaling in SCC-SCs.

A Role for Glutathione Metabolism in the Cisplatin
Resistance of TGF-β-Responding SCC-SCs

To address the physiological relevance of the p21/NRF2
pathway that we unearthed in vitro, we first showed that GSTα,
one of the highly upregulated glutathione pathway genes, was
indeed upregulated at the protein level in TGF-β reporter+ cells
at the tumor-stroma interface of invasive SCCs (Figure 7A). To
test whether p21 mediates TGF-β-induced drug resistance, we
conducted in vivo KDs by introducing Cdkn1a shRNAs into our
LV TGF-β reporter constructs (Figure 7B). In Cdkn1a KD tumors,
p21 expression was abrogated in TGF-β reporter+ cells.
Importantly, GSTα was also downregulated upon Cdkn1a KD
(Figure 7C). Similar results were seen in tumors with Nfe2l2 KD
(shown). Together, these findings showed that a key component
of glutathione metabolism was dependent upon NRF2 and
TGF-β-regulated Cdkn1a.

Notably, the slow-cycling behavior of TGF-β-responding
SCC-SCs was not affected appreciably by loss of either p21 or
NRF2 (Figure 7D), providing a means of selectively reducing
glutathione metabolism without compromising slow-cycling
status in TGF-β-responding SCC-SCs. We therefore proceeded
to address whether Cdkn1a and Nfe2l2 KD would affect their
cisplatin resistance. In the scramble control, TGF-β-responding
SCC-SCs showed little apoptosis. In contrast, when mice
bearing tumors knocked down for either Cdkn1a or Nfe2l2
were treated with cisplatin, many basal tumor cells were
double-positive for AcCasp3 and TGF-β reporter (Figure 7E).
Similarly, drug inhibition of GST increased the sensitivity of
TGF-β-responding tumor cells to cisplatin in vitro (Figure 7F). Thus,
under circumstances where SCC-SCs were still slow-cycling
and responsive to TGF-β, the normalization of glutathione
metabolism was sufficient to abolish the survival advantage of
TGF-β-responding basal cells to cisplatin treatment.

Finally, we addressed the converse, namely whether by
enhancing NRF2 stabilization, we could confer enhanced
resistance of TβRII-null SCC-SCs to cisplatin. In vitro, both NRF2
protein and NRF2 reporter activity rose in TβRII-null
HRasG12V-transformed MKs transduced by either of two different Keap1
shRNAs (Figure 7G). Correspondingly, this resulted in enhanced
resistance to cisplatin (Figure 7H). When equal numbers of
non-transduced and either scramble- or Keap1 shRNA-transduced
TβRII-null HRasG12V-transformed MKs were mixed and then
engrafted in vivo, tumors arose which displayed increased NRF2
specifically in Keap1 shRNA-transduced MKs (Figure 7I). Most
importantly, fewer of these basal tumor cells apoptosed after
cisplatin treatment (Figure 7J). Overall, these data underscore
the importance of this pathway in imparting enhanced drug
resistance to TGF-β-responding SCC-SCs.

DISCUSSION 

Functional and phenotypic heterogeneity among tumor cells
has long been recognized, and dynamic changes in genetic,
epigenetic, tumor microenvironmental and systemic factors
allow subpopulations of tumor cells to acquire advantages for
proliferation, survival, spread, and resistance to anti-cancer
therapeutics. In studying stem cells of SCCs, we realized that
they vary in cycling rates, and that SC numbers increase when
TGF-β signaling is abrogated. These findings raised the
possibility that TGF-β signaling might be at the root of tumor
heterogeneity and that it might impact directly on cancer SCs.

In the present study, we established an in vivo LV delivery
system, which allowed us to address roles of TGF-β signaling in the
early stages of tumor progression. Our findings provide compelling
in vivo evidence of TGF-β’s contributions to the emergence
of tumor heterogeneity in the tumor-initiating cells that drive
SCC. The heterogeneity in TGF-β-responsiveness is rooted in
the congregation of myeloid cells near the tumor vasculature.
While TGF-β is secreted in a latent form and deposited in
ECM, TGF-β can be activated and released by a variety of
mechanisms, which include activated integrins. The paracrine
activation of TβRI/II-pSMAD2 in a subset of nearby (integrinhi)
SCC-SCs reflected active TGF-β signaling in these cells.

TGF-β represses normal epithelial growth, thereby functioning
as an early tumor suppressor. These effects have been
extensively studied, as have TGF-β’s late-stage roles in metastasis.
However, in the absence of tools to explore intermediate stages
of primary tumor progression, the prior speculation as to TGF-β’s
dual function in these intermediate steps has been that TGF-β’s
cytostatic effects are lost during tumor progression due to
activation of oncogenic pathways such as Ras-MAPK, PI3K and
c-Myc, which then override TGF-β’s growth inhibitory effects
(Chen et al., 2002; Seoane et al., 2004; Gomis et al., 2006; Bruna
et al., 2007). At later stages, other TGF-β responses then
purportedly prevail that are unrelated to TGF-β’s cytostatic effects
but which favor tumor invasion and metastasis.

Figure 7 (continued). with Keap1 shRNA. (H) Immunofluorescence of transduced YFP+TβRIIneg HRasG12V-MKs treated with cisplatin for 10 hr. Note that Keap1 KD (H2B-RFP+)
suppresses apoptotic rounding. (Right) Quantifications of AcCasp3+ cells in the same experiment.

(I) Anti-NRF2 (green) co-immunolabeling of Keap1 KD (H2B-RFP+) cells in TβRIIneg HRasG12V allograft tumors.

(J) AcCasp3+ quantifications show that Keap1 but not scramble shRNA protects TβRIIneg HRasG12V allograft SCC-SCs against cisplatin.

Data are box-and-whisker plots (C, D, H and J) and mean ± SEM (E and G). Scale bars, 50 μm. See also Figure S6.

By designing a functional TGF-β reporter system and coupling
Cre activity to TGF-β signaling, we could monitor and compare
the behaviors and fates of the subset of basal tumor SCs that
activate TβRI/II-pSMAD2 signaling (or which we blocked from
doing so) in developing tumors expressing HRasG12V, commonly
mutated in human skin cancers. Our lineage tracing analyses
of the clonal growth of these cells revealed clearly that TGF-β’s
cytostatic and tumor-promoting effects are not mutually
exclusive. Rather, TGF-β-responding cells display morphological
and biochemical features of invasiveness and malignant
conversion during a period when they are in a slow-cycling proliferation
state. Moreover, and most importantly, our findings show that
these slow-cycling invasive tumor cells gain a marked advantage
over hyperproliferative, more tightly clustered TβRII-null
SCC-SCs in that they are better protected against DNA damaging
agents such as cisplatin.

Cisplatin resistance is a hallmark of head and neck SCCs.
Suggested mechanisms for resistance include reduced drug
uptake, increased drug efflux, inactivation by GSH conjugation,
increased DNA damage repair, and failure to induce apoptosis
(Kelland, 2007). Another proposed route is the failure of
anti-cancer therapies to target slower-cycling cancer SCs (Meacham
and Morrison, 2013). Since our results show that TGF-β reporter+
skin tumor cells are indeed slower-cycling, it seemed plausible
that at least in part, the enhanced protection against
anti-proliferative cancer drugs might be attributable to the slower-cycling
status of TGF-β-responding SCs. However, as we learned, this
was only the tip of the iceberg in what TGF-β responsiveness
is able to achieve, and in fact slower-cycling status appears to
be secondary to SCC-SC drug resistance.

Given that the alterations provoked by TGF-β in SCC-SCs
extended beyond cycling rates, we asked whether additional
changes in transcriptomes might explain the enhanced
resistance of TGF-β-responding SCC-SCs to cisplatin. The TGF-β
signature encompassed genes already known to play a key
role in stemness. However, the signature also showcased
genes involved in glutathione metabolism. Delineating
mechanism, TGF-β induced Cdkn1a transcription, leading to p21-mediated
NRF2 stabilization and induction of a cohort of glutathione
metabolism genes. Most importantly, when Cdkn1a or Nfe2l2
were knocked down, SCC-SCs were still responsive to TGF-β
and were still slower-cycling, but their survival in the face of
cisplatin was normalized. Further bolstering the importance of
this pathway, KD of Keap1 in TβRII-null SCs resulted in not
only enhanced NRF2 target gene activity but also enhanced
survival in cisplatin-treated SCCs.

The role of enhancing antioxidant reactions and glutathione
metabolism is still obscure. Although inflammatory cells can
have anti-tumorigenic roles, they can also release ROS, which
are actively mutagenic, thereby accelerating the genetic evolution
of nearby cancer SCs (Grivennikov et al., 2010). In the present
study, we showed that even though immune cells concentrate
near the vasculature, ROS levels are preferentially reduced in
the subset of TGF-β-responding SCC-SCs, and that this is
attributable to the TGF-β/p21/NRF2 pathway we delineated here.



In closing, since not all tumor-initiating cells localize to the
perivasculature where TGF-β is concentrated, our studies
show that this can be advantageous to the tumor. On the one
hand, SCC-SCs that do not respond to TGF-β are faster-cycling
and can greatly accelerate tumor growth and differentiation. On
the other hand, TGF-β-responding SCC-SCs cycle more slowly
but show enhanced invasiveness and increased glutathione
metabolism, thereby increasing the likelihood not only of
metastasis but also long-term survival in the face of ROS and
anti-cancer drugs. Given that perivasculature is an emerging niche for
many types of SCs, it will be interesting to see in the future if
the mechanisms we’ve unearthed here will extend to other
cancers. It will also be important to know whether these
mechanisms are operative in human SCCs. In this regard, an analysis
of the TCGA database revealed poor prognosis for patients
with SCCs that upregulate this cohort of glutathione metabolism
genes (Figure S6). Whether these tantalizing parallels between
mouse and human SCCs will continue to hold must await future
cancer databases on tumor-initiating cells as opposed to whole
tumor samples. If so, these hitherto underappreciated roles for
TGF-β signaling in tumor heterogeneity and anti-cancer
resistance could serve as a foundation for designing
chemotherapeutics that might overcome drug resistance for this devastating
cancer.

EXPERIMENTAL PROCEDURES 

In Vivo LV Transduction, Tumor Formation, and Drug Treatment 

LV production, concentration, and ultrasound-guided in utero transduction
were performed as described (Beronja et al., 2010). Briefly, female mice at
day 9.5 of gestation were anesthetized with isoflurane (Hospira). In utero,
0.5 μl LV was microinjected into each embryo's amniotic sac. To induce tumor
formation, rtTA3 was activated by feeding adult mice with Doxy (2 mg/kg)
chow. Cre was activated by intraperitoneal (i.p.) injection of Tam (Sigma) in
corn oil: TGFβ-CreER, 25 μg/g low-dose, 3 × 100 μg/g high-dose; K14-CreER,
17.5 μg/g. Cisplatin (Sigma) was dissolved in saline (1 mg/ml) and
administered by i.p. injection (10 mg/kg). TβRI kinase inhibitor (LY364947, Tocris)
was i.p. injected (1 mg/kg), 3×/wk. For limit-dilution transplantation and
xenotransplantation, 1 .0 X 1 0^-1 0® mouse primary tumor cells and 1 .0 x 1 0® human
SCC cells were subcutaneously injected with Matrigel (BD) in Nude mice.
Tumor size was calculated using the formula 4/3π × L/2 × W/2 × D/2. For
cell proliferation analysis in vivo, BrdU (50 μg/g) or EdU (25 μg/g) was injected
i.p. 4 or 12 hr before lethal administration of CO2. All procedures were
performed with IACUC-approved protocols.
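
The tumor size formula above is the volume of an ellipsoid whose semi-axes are half the measured length, width, and depth. A minimal Python sketch follows; the function name and the assumption that all three dimensions are supplied in the same unit (for example mm, giving mm³) are ours and are not specified in the text.

```python
import math


def tumor_volume(length: float, width: float, depth: float) -> float:
    """Ellipsoid volume: 4/3 * pi * (L/2) * (W/2) * (D/2).

    All three dimensions are assumed to share one unit; the result is
    in that unit cubed.
    """
    if min(length, width, depth) < 0:
        raise ValueError("dimensions must be non-negative")
    return (4.0 / 3.0) * math.pi * (length / 2) * (width / 2) * (depth / 2)
```

Algebraically the expression simplifies to π/6 × L × W × D, so a tumor measuring 6 × 6 × 6 in any unit has a volume of 36π in that unit cubed.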

In Vitro Cell Culture Experiments 

Cells were cultured in E medium with 15% fetal bovine serum (FBS) and 50 μM
CaCl2 (1°MKs) or 1.5 mM CaCl2 (human SCC lines) at 37°C, 7.5% CO2. For
stimulation experiments, media were supplemented with either recombinant
mouse TGF-β1 (100 pM = 2.5 ng/ml, R&D Systems), 4-hydroxytamoxifen
(1 μM, Sigma), cisplatin (20 μM, Sigma), H2O2 (1-1,000 μM, Fisher), or
ethacrynic acid (50 μM, Abcam). For colony formation assay, FACS-isolated
1 .0 X 1 0"^ α6hiCD44hi mCherryneg and α6hiCD44hi mCherry+ primary tumor basal
cells were plated onto mitomycin-treated mouse 3T3 fibroblasts in 6-well
dishes in E medium with 15% FBS and 300 μM CaCl2. Colony number was
counted after 14 day culture.
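
The stated equivalence of 100 pM and 2.5 ng/ml for TGF-β1 implies a molecular mass of about 25 kDa. The short Python sketch below makes the unit conversion explicit; the function name is hypothetical, and the 25,000 g/mol figure is simply back-calculated from the equivalence given in the text rather than taken from a product datasheet.

```python
def pm_to_ng_per_ml(conc_pm: float, mol_mass_g_per_mol: float) -> float:
    """Convert a picomolar concentration to ng/ml.

    pM -> mol/L (x 1e-12), x g/mol -> g/L, x 1e9 -> ng/L, / 1e3 -> ng/ml,
    which collapses to conc_pm * mol_mass * 1e-6.
    """
    return conc_pm * mol_mass_g_per_mol * 1e-6


# Assumed molecular mass, back-calculated from "100 pM = 2.5 ng/ml".
TGFB1_ASSUMED_MASS = 25_000.0
```

Under that assumption, `pm_to_ng_per_ml(100, TGFB1_ASSUMED_MASS)` evaluates to approximately 2.5 ng/ml, consistent with the concentration quoted above.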

Statistics 

Data were analyzed and statistics performed (unpaired two-tailed Student’s t
test) in Prism5 (GraphPad). Significant differences between two groups were
noted by asterisks or actual p values (*p < 0.05; **p < 0.01; ***p < 0.001).
Quantification data were presented as mean value ± SEM or in box-and-whisker plots
with the dimensions of the box encompassing the 25th-75th percentile, the
horizontal bar representing the median, and the error bars representing
minimum and maximum values.
• Early death signaling predicts cytotoxicity days before cell

• Dynamic BH3 Profiling (DBP) is a functional measure of
death signaling

• DBP predicts response to targeted agents in vitro and in vivo

SUMMARY

There is a lack of effective predictive biomarkers to 
precisely assign optimal therapy to cancer patients. 
While most efforts are directed at inferring drug 
response phenotype based on genotype, there is 
very focused and useful phenotypic information to 
be gained from directly perturbing the patient’s living 
cancer cell with the drug(s) in question. To satisfy this 
unmet need, we developed the Dynamic BH3 Pro- 
filing technique to measure early changes in net 
pro-apoptotic signaling at the mitochondrion (“prim- 
ing”) induced by chemotherapeutic agents in cancer 
cells, not requiring prolonged ex vivo culture. We find 
in cell line and clinical experiments that early drug- 
induced death signaling measured by Dynamic BH3 
Profiling predicts chemotherapy response across 
many cancer types and many agents, including com- 
binations of chemotherapies. We propose that 
Dynamic BH3 Profiling can be used as a broadly 
applicable predictive biomarker to predict cytotoxic 
response of cancers to chemotherapeutics in vivo. 

INTRODUCTION 

A fundamental challenge across medicine is to assign to a patient
the drug or combination of drugs that will be of greatest
benefit. In oncology, this choice has historically been driven by 
the anatomic location and histology of the tumor. Later, thera- 
peutic decision-making was assisted by immunohistochemistry, 
cytogenetics, and flow cytometric analysis of cell surface anti- 
gens. In more recent years, there are examples where gene 
expression signatures and specific genetic alterations have 
been essential to therapeutic decisions (Chapman et al., 2011; 
Paez et al., 2004). However, true personalization of therapy re- 
mains an elusive goal in most cases. In all too many cases, can- 
cer patients show little benefit from therapy. Moreover, it is likely 
that many tumors have unrecognized sensitivity to agents for 
which there is simply no useful predictive biomarker to inform 
therapy decisions (Garraway and Janne, 2012; Haibe-Kains 
et al., 2013). In this era of growing therapeutic options, there is 
a comparable growing need for predictive biomarkers (Sawyers,
2008; Yaffe, 2013). 

A feature common to nearly all of the biomarkers in use or in 
development in oncology is that they are studies performed 
on dead cancer cells. They are attempts to predict cancer cell 
behavior based on detailed analysis of components of the cell, 
such as DNA, RNA, or proteins (Barretina et al., 2012). In some 
cases, abnormalities in single genes are studied. There are spec- 
tacular examples of success with this approach, such as the use 
of EGFR mutations to guide treatment with EGFR inhibitors in 
lung cancer (Paez et al., 2004), BRAF mutations to guide treatment
with vemurafenib in melanoma (Chapman et al., 2011), or
cKIT mutations to guide treatment with imatinib in GIST (Joensuu 
et al., 2001). However, most drugs in development or approved 
for cancer lack a simple genetic predictor, which impedes their 
clinical development (Sikorski and Yao, 2010). One popular 
approach to this problem is to identify signatures based on 
huge amounts of information based on genomes, transcrip- 
tomes, or proteomes (Barretina et al., 2012; Garraway and 
Janne, 2012). These strategies are relatively early in develop- 
ment and their power remains to be seen. Despite the abun- 
dance of information these strategies provide, they still share a 
weakness: they are all studies of dead cancer cells. They lack 
a measure of cancer cell function or response to perturbation. 
Studies of complex systems in and out of biology are often 
greatly augmented by observations of responses to strategic 
perturbations. Here, we present results of strategic perturba- 
tions of cancer cells with drugs and their mitochondria with pep- 
tides in a strategy we call Dynamic BH3 Profiling (DBP). 

DBP interrogates the BCL-2 family of proteins that regulates 
commitment to the mitochondrial pathway of apoptosis, the pro- 
gram of cell death that is commonly used by cancer cells in 
response to most chemotherapeutic agents. The BCL-2 family 
of proteins controls mitochondrial outer membrane permeabili- 
zation (MOMP) (Certo et al., 2006; Chipuk et al., 2010). The 
effector proteins BAX and BAK, when activated, oligomerize to
form pores in the mitochondrial outer membrane that induce 
release of cytochrome c and the loss of mitochondrial trans- 
membrane potential, as well as release of SMAC/DIABLO and 
other proteins that trigger apoptosome formation, caspase acti- 
vation, and finally apoptosis (Kluck et al., 1997; Wei et al., 2001). 
These effector proteins can be activated by the BH3-only 
proteins BIM and BID (and perhaps PUMA), also known as 
activators (Sarosiek et al., 2013). Both effectors and activators
can be inhibited by the anti-apoptotic members of the family,
including BCL-2, BCL-XL, MCL-1, and others (Certo et al.,
2006). There is a fourth group of proteins, called sensitizers
(comprising proteins like BAD, BMF, NOXA, HRK, and others),
that by themselves are not able to induce BAX and BAK oligomerization
but instead selectively inhibit the anti-apoptotic members
of the family, thus indirectly promoting MOMP (Letai et al.,
2002). The BH3 domain is a roughly 20-amino acid amphipathic
alpha helix that is necessary for most of the heterodimeric
interactions of BCL-2 family proteins that regulate apoptosis.
Synthetic BH3 domain oligopeptides can execute most of the
pro-apoptotic functions of pro-apoptotic BCL-2 family proteins
(Certo et al., 2006).

BH3 peptides are thus a convenient, titratable reagent that can
be exploited to systematically study mitochondrial readiness to
undergo apoptosis. This understanding of the BCL-2 family of
proteins and their interactions allowed the development of the
BH3 profiling technique (Ryan et al., 2010) that identifies cancer
cells' selective dependence on anti-apoptotic proteins, and also
measures overall apoptotic sensitivity or “priming for death”
(Deng et al., 2007a). “Priming” is a measure of how close a cell
is to the threshold of apoptosis. Procedurally, priming corresponds
to the sensitivity of mitochondria to BH3 peptides. The
more sensitive mitochondria are to BH3 peptides, the more
primed they are. We have previously found that the state of
“priming” prior to therapy was an excellent predictor of chemotherapeutic
response in vivo (Ni Chonghaile et al., 2011; Vo et al.,
2012). Differences in priming between cancer cells and normal
tissues also provide an explanation for the therapeutic index of
conventional chemotherapeutic drugs that target ubiquitous elements
such as DNA and microtubules.

The main principle of DBP is to expose cancer cells to short
incubations with drugs of interest and measure whether the drug
exposure induces an increase in priming compared to an untreated
control. In this paper we use DBP to test the hypothesis
that early death signaling predicts cytotoxicity, even when the
cell death does not occur until days after the death signaling is
measured. Our results support the model that initiation of death
signaling is the main regulator of eventual commitment to cell
death. Moreover, we show that we can perform these measurements
on primary patient cancer cells in a way that predicts clinical
response to therapy.
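
The comparison at the core of DBP can be written as a simple difference. The sketch below assumes that priming is reported as the percentage of BH3 peptide-induced mitochondrial depolarization (0 to 100) for treated and untreated cells; the function and argument names are hypothetical and are not taken from the paper.

```python
def delta_percent_priming(treated_pct, untreated_pct):
    """Return the drug-induced change in priming, in percentage points.

    Both inputs are assumed to be percent mitochondrial depolarization
    measured at the same BH3 peptide concentration. A positive result
    means the drug moved the cells closer to the apoptotic threshold.
    """
    for value in (treated_pct, untreated_pct):
        if not 0.0 <= value <= 100.0:
            raise ValueError("priming must be a percentage between 0 and 100")
    return treated_pct - untreated_pct
```

A treated sample at 55% depolarization against an untreated control at 20% would give a change of 35 percentage points under this definition.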



RESULTS 

DBP Predicts Chemotherapy Sensitivity in Non-Small 
Cell Lung Cancer Cell Lines 

Our strategy rests upon the hypothesis that it is the initiation of
death signaling that distinguishes cells destined to be killed by
an agent from those destined to survive.

We therefore rigorously tested the hypothesis that measurement
of early death signaling by DBP (Figure 1A) could predict
a cytotoxic response that did not occur until several days later.
We first used non-small cell lung cancer (NSCLC) cell lines
derived from PC9. This cell line has an exon 19 deletion in the
EGFR gene rendering it sensitive to EGFR-specific tyrosine
kinase inhibitors (TKI) like erlotinib or gefitinib. PC9GR was
obtained by continuously exposing PC9 to increasing concentrations
of gefitinib (Ercan et al., 2010), selecting for a T790M mutation
in EGFR that renders it non-sensitive to gefitinib but still
sensitive to the mutant-selective EGFR TKI WZ4002 (Zhou
et al., 2009). A third cell line, PC9WZR, was similarly selected
for resistance to WZ4002. It possesses an EGFR T790M mutation
and a MAPK1 amplification conferring resistance to both
gefitinib and WZ4002. However, PC9WZR is sensitive to the
combination of WZ4002 with the MEK inhibitor CI-1040, which
completely blocks the MAPK pathway (Ercan et al., 2012).
This set of cell lines provided a useful initial model of differential
sensitivity to targeted therapies upon which to test our strategy.

We performed DBP on each of the cell lines using a 16 hr
treatment with gefitinib, WZ4002, CI-1040, or the combination
WZ4002 plus CI-1040. Sixteen hours was chosen after empirically
testing 4, 8, 16, 24, and 48 hr as it was the earliest time point
that reliably provided a significant change in priming in PC9
cells treated with gefitinib. After testing several BH3 peptides,
including BIM, HRK, and PUMA BH3, we found that BIM BH3
concentrations of 0.3 and 1 µM provided the most useful dynamic
range (Figure 1 and Figure S1). Drug concentrations
were chosen based on our and others' prior experience and
the dose required for a complete blockade of the MAPK pathway
(Ercan et al., 2012; Ercan et al., 2010). We observed an increase
in priming induced in PC9 by gefitinib, WZ4002, and WZ4002 +
CI-1040, as shown by the increase in BIM BH3-induced
mitochondrial depolarization (Δ% priming). In PC9GR cells,
WZ4002, but not gefitinib, increased priming. In PC9WZR cells,
only the WZ4002 + CI-1040 increased mitochondrial priming



Figure 1. DBP Predicts Chemotherapy Sensitivity in PC9 Cell Lines

(A) To perform DBP we obtain a single cell suspension from a cell line or a primary sample, and we expose the cells to the different drug treatments to be tested. After this incubation, we permeabilize, stain with the fluorescent dye JC-1, and expose the cells to different BH3 peptides that will promote mitochondrial depolarization and MOMP, the ultimate event that triggers apoptosis. By comparing the non-treated cells with the treated ones, DBP will determine the Δ% priming for each agent and identify which are most effective to induce apoptosis in that particular sample. All this analysis is performed in less than 24 hr, minimizing ex vivo culture.

(B) DBP was performed on three different PC9 cell lines: parental PC9, PC9GR (gefitinib resistant, T790M mutation present), and PC9WZR (gefitinib and WZ4002 resistant, T790M mutation present), using a 16 hr incubation of: gefitinib 1 µM, WZ4002 100 nM, CI-1040 3 µM (MEK inhibitor), and WZ4002 + CI-1040. Results expressed as Δ% priming (increase in priming compared to non-treated cells). Values indicate mean values ± SEM, at least three independent experiments were performed (n ≥ 3).

(C) Cell death measurements at 72 hr for the same cell lines under the same treatments by FACS using Annexin V/PI staining. Results are expressed as increase in cell death or Δ% cell death, compared to non-treated cells. Values indicate mean values ± SEM, at least three independent experiments were performed (n ≥ 3).

(D) Plot showing correlation between Δ% priming at 16 hr and Δ% cell death at 72 hr. ROC curve analysis at right.

(E) Western blot analysis, showing changes in the BCL-2 family of proteins. See also Figures S1, S2, and S3.
(Figure 1B). We next measured cell death at 72 hr for the same
cell lines following the same treatments using FACS analysis of
Annexin V and propidium iodide (PI) staining (Figure 1C). When
we compared Δ% priming and Δ% cell death, we observed an
excellent correlation between both measurements (Figure 1D,
left). The receiver operating characteristic (ROC) was also excellent,
performing perfectly in this small number of tests (Figure 1D,
right). Note that DBP was performed at 16 hr when no significant
cell death was evident, whereas cell death was analyzed more
than 2 days later (Figure S2). Thus, the early priming increase
measured by DBP provided accurate, drug-specific predictions
about cytotoxicity even though the death took place days later.

DBP should only be predictive if the mitochondrial apoptosis 
pathway is being engaged. To confirm this engagement, we 
analyzed PARP cleavage, as well as levels of BIM, BCL-2, 
and BCL-XL proteins following 24 hr of drug treatment. When 
cytotoxicity was observed, PARP cleavage was detected. In 
addition, cytotoxicity correlated with either increases in BIM, de- 
creases in anti-apoptotic proteins, or a combination of both 
effects, supporting the simultaneous participation of multiple 
BCL-2 family proteins in the determination of cell fate (Deng 
et al., 2007b; Faber et al., 2011) (Figure 1E).

In order to determine if this predictive capacity of DBP could
be generalized to other NSCLC models, we treated six different
NSCLC cell lines with gefitinib, WZ4002, AZD6244 (MEK inhibitor),
BEZ235 (PI3K/mTOR inhibitor), and the combination
AZD6244 + BEZ235, which was previously described to treat murine
lung cancers harboring the KRas G12D mutation (Engelman
et al., 2008; Faber et al., 2009). We chose drug concentrations
that had previously demonstrated in vitro cytotoxicity. Again,
we compared the priming increase measured by DBP after
16 hr of treatment with cell death observed at 72 hr (Figure S3A).
Some of the cell lines analyzed had a tendency to show less
cytotoxicity than would be expected by DBP for a few drugs. It
is possible that measurement of cell death at longer time points
would reduce such disagreements. Nonetheless, we observed a
significant correlation between Δ% priming and Δ% cell death
when all cell lines and treatments were considered (Figure S3B).
To assess if DBP provided a useful binary predictor of cytotoxicity,
we performed ROC curve analysis (Pencina et al., 2008).
Typically, a random classifier would present an AUC of 0.5, while
a perfect classifier would have an AUC of 1. In this case, the area
under the ROC curve is 0.895 (Figure S3C), comparing favorably
with the ROC performance of many clinically used predictors
(Burstein et al., 2011).
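
To make the AUC values concrete, here is a minimal sketch of how an area under the ROC curve can be computed from paired scores and binary outcomes. It uses the standard rank-based (Mann-Whitney) formulation, which is not necessarily the software procedure the authors used. Treating Δ% priming as the score and a binary "cytotoxic or not" call as the label is an assumption for illustration, as is the function name.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predictor values (for example, delta % priming).
    labels: truthy for responders, falsy for non-responders.
    Returns the fraction of responder/non-responder pairs in which the
    responder has the higher score, counting ties as one half.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one responder and one non-responder")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

Under this formulation a predictor that ranks every responder above every non-responder scores 1, and one that ranks them at random scores about 0.5, matching the reference points given in the text.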

Note that this analysis relies not simply on measurements of 
the baseline priming but rather on the degree to which drugs in- 
crease priming from that baseline. 

DBP Predicts Cytotoxicity in Breast Cancer Cells 

To test the generalizability of our hypothesis in a different type of
cancer, we performed a similar set of experiments with five
different human breast cancer cell lines treated with gefitinib,
lapatinib (HER2 inhibitor), MK-2206 (AKT inhibitor), AZD6244 (MEK
inhibitor), BEZ235 (PI3K/mTOR inhibitor), dinaciclib (SCH
727965, CDK inhibitor), ABT-888 (PARP inhibitor), and the combination
AZD6244 + BEZ235, as previously described (Faber
et al., 2009). Again we observed a significant correlation between
Δ% priming after 16 hr of treatment and Δ% cell death at 96 hr
(Figures 2A and 2B). The area under the ROC curve for this set
of cell lines is 0.93 (Figure 2C); thus, objectively, DBP is an excellent
binary predictor for breast cancer cell lines' response to
chemotherapy.

Selecting the Optimal Kinase Inhibitor Using Dynamic 
BH3 Profiling 

In clinical practice, an important application of a potentially 
powerful, widely applicable predictive biomarker would be to 
choose from among a panel of possible therapies (Sawyers, 
2008). This is the central goal of what is currently commonly 
termed “precision medicine.” We hypothesized that if we could 
compare the death signaling induced by several different agents 
in a cancer cell, we could pick the ones that would work best. To 
test this principle, we selected ten different cancer cell lines, 
chosen simply by variety and availability. For drugs, we chose 
nine kinase inhibitors, for their diversity of targets and known 
in vivo activity. We chose kinase inhibitors because of their 
known use of the mitochondrial apoptotic pathway to kill cancer 
cells (Bhatt et al., 2010; Faber et al., 2011). Our question was, 
among these diverse cell lines and drugs, could DBP at an early 
time point be used to make individualized choices of the drugs 
most likely to kill each cancer cell line. 

For this purpose we selected drugs targeting either key membrane
receptor tyrosine kinases, such as gefitinib (EGFR inhibitor),
imatinib (ABL inhibitor), lapatinib (HER2 inhibitor), PD173074 (FGFR
inhibitor), and TAE684 (ALK inhibitor), or important intracellular
serine/threonine kinases, including MK-2206 (AKT inhibitor),
PLX4032 (BRAF V600E inhibitor), AZD6244 (MEK inhibitor), and
BEZ235 (PI3K/mTOR inhibitor). All of the compounds tested previously
demonstrated cytotoxicity in cancer cell lines and/or murine
cancer models, including hematological malignancies (Bhatt
et al., 2010) and solid tumors (Maertens et al., 2013). We tested
the panel of kinase inhibitors on several human hematological
cancer cell lines: K562 (chronic myelogenous leukemia or
CML), DHL6 (diffuse large B-cell lymphoma), LP1 (multiple
myeloma), DHL4 (diffuse large B-cell lymphoma), and AML3
(acute myelogenous leukemia). First, we performed DBP after
16 hr exposure to the different treatments (Figure 3A). We
compared the DBP results to cell death achieved at 72 hr, expressed
as Δ% cell death (Figure 3A). Each cell line demonstrated
a distinct pattern of drug-induced priming increase, a
distinct fingerprint of pathway addiction, just as there was a
distinct pattern of cytotoxic response to the drug panel. Most
importantly for our question, however, there was an excellent
correlation of DBP with cytotoxicity days later (Figure 3B). For
this set of hematological cell lines, predictive power of DBP
was demonstrated by an AUC of the ROC curve of 0.83 (Figure 3C).
Note that DBP identified the agent causing greatest
cytotoxicity in four out of five cell lines. In the one exception,
LP1, there was little cytotoxicity induced by any of the drugs.
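
Choosing the agent most likely to kill a given cell line from a DBP panel amounts to ranking drugs by their Δ% priming. The sketch below illustrates that ranking step only; the function name and the dictionary input are hypothetical, and the values in the test are placeholders, not data from the paper.

```python
def rank_drugs_by_priming(delta_priming_by_drug):
    """Return drug names ordered from highest to lowest delta % priming.

    delta_priming_by_drug maps a drug name to the measured change in
    priming for one cell line or sample. The first entry of the result
    is the agent DBP would nominate as most likely to be cytotoxic.
    """
    return sorted(
        delta_priming_by_drug,
        key=lambda drug: delta_priming_by_drug[drug],
        reverse=True,
    )
```

Comparing the top-ranked drug with the drug that later produced the greatest Δ% cell death is the kind of check described above for the five hematological lines.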

We next examined the predictive capacity of DBP with several 
diverse human solid tumor cell lines: MCF7 (breast cancer), PC9 
(non-small cell lung cancer), SK-MEL-5 (melanoma), HCT116
(colon carcinoma) and MDA-MB-231 (breast cancer). We 
exposed the cells to the different treatments for 16 hr and per- 
formed DBP (Figure 4A), comparing it with the cell death 
Figure 2. DBP Predicts Chemotherapy Sensitivity in Breast Cancer Cell Lines

(A) DBP was performed in five breast cancer cell lines: BT20, HCC1954, SKBR3, T47D, and HCC2218, showing different patterns of response to the treatments tested (16 hr incubation): (1) gefitinib 1 µM, (2) lapatinib 1 µM, (3) MK-2206 1 µM, (4) AZD6244 1 µM, (5) BEZ235 1 µM, (6) dinaciclib 10 nM (SCH 727965), (7) ABT-888 5 µM, and the combination (8) AZD6244 + BEZ235. Results expressed as Δ% priming (increase in priming compared to non-treated cells). Values indicate mean values ± SEM, at least three independent experiments were performed (n ≥ 3). Cell death measurements at 96 hr for the same cell lines under the same treatments by FACS using Annexin V/PI staining. Results are expressed as increase in cell death or Δ% cell death, compared to non-treated cells. Values indicate mean values ± SEM, at least three independent experiments were performed (n ≥ 3).

(B) Plot showing the significant correlation between Δ% priming at 16 hr and Δ% cell death at 96 hr.

(C) ROC curve analysis.
Figure 3. Identifying the Optimal Treatment in Hematological Malignancies Using DBP

We selected several drugs targeting either key membrane receptors: (1) gefitinib 1 µM, (2) imatinib 1 µM, (3) lapatinib 1 µM, (4) PD173074 1 µM, and (5) TAE684 1 µM; or important intracellular kinases: (6) MK-2206 1 µM, (7) PLX4032 10 µM, (8) AZD6244 1 µM, and (9) BEZ235 1 µM, and we tested them with several human hematological cancer cell lines: K562, DHL6, LP1, DHL4, and AML3.

(A) DBP (16 hr incubation) results expressed as Δ% priming and cell death measurements at 72 hr using Annexin V/PI staining expressed as Δ% cell death. Values indicate mean values ± SEM, at least three independent experiments were performed (n ≥ 3).

(B) Plot showing the significant correlation between Δ% priming at 16 hr and Δ% cell death at 72 hr.

(C) ROC curve analysis shows AUC = 0.83.
Figure 4. Identifying the Optimal Treatment in Solid Tumors Using DBP

We tested the same panel of kinase inhibitors on several human solid tumor cell lines: MCF7, PC9, SK-MEL-5, HCT116, and MDA-MB-231.

(A) DBP (16 hr incubation) results expressed as Δ% priming and cell death measurements at 72 or 96 hr (as indicated) using Annexin V/PI staining expressed as Δ% cell death. Values indicate mean values ± SEM, at least three independent experiments were performed (n ≥ 3).

(B) Plot showing the significant correlation between Δ% priming at 16 hr and Δ% cell death at 72/96 hr.

(C) The ROC curve analysis has an AUC = 0.96, indicating that DBP is an excellent binary predictor for chemotherapy response in solid tumor cell lines. See also Figure S4.
Figure 5. DBP Is a Good Binary Predictor for Cell Lines

(A) Compilation of Figures 1, 2, 3, 4, and S3 results, showing a significant
correlation between Δ% priming and Δ% cell death for all cell lines analyzed.

(B) The total area under the ROC curve is 0.89, indicating that DBP is a good binary
predictor for chemotherapy response in all the cell lines and treatments tested.

observed at 72-96 hr (Figure 4A). In some cases, a 96 hr time
point was required due to slow kinetics of cytotoxicity. As
observed for hematological malignancies, the different cell
lines responded differently to the drugs tested, but a significant
correlation between DBP and cytotoxicity was detected (Figure 4B).
SK-MEL-5 was the only one sensitive to PLX4032, as expected
for a BRAF V600E-expressing melanoma cell line, but was
also sensitive to MEK (AZD6244) and PI3K/mTOR (BEZ235) inhibition,
correlating with the cell death detected three days later, at
96 hr. On the other hand, PC9, as shown previously (Figures 1B
and 1C), responded to gefitinib (Ercan et al., 2010; Faber et al.,
2011) but also to lapatinib and TAE684, correlating with cell
death at 72 hr. For this set of solid tumor cell lines the AUC of
the ROC curve was 0.96 (Figure 4C). In three out of five cell lines,
DBP clearly predicted the most cytotoxic drug. In the other two,
MCF7 and HCT116, there was nearly equal maximum response
of the same two drugs in both DBP and cytotoxicity.

Throughout this paper, we use loss of fluorescence from an indicator
compound, JC-1, that is sensitive to the electropotential
gradient across the inner mitochondrial membrane. We have
previously shown that this JC-1 signal provides a good surrogate
for permeabilization of the outer mitochondrial membrane (Ryan
et al., 2010). To verify that this surrogacy is maintained in DBP,
we compared measuring MOMP by JC-1 or by efflux of cytochrome c
as read on a flow cytometer (Ryan and Letai, 2013)
(Figure S4). Our results show good agreement between the
two techniques, supporting the use of JC-1 fluorescence as a
surrogate for MOMP in the context of DBP.

To test the generalizability of the principle that early drug- 
induced priming changes predict eventual cytotoxicity across 
a wide variety of both solid and liquid cancers and a wide variety 
of agents, we combined the data of Figures 1, 2, 3, 4 and Figure S3.
We observed that there is a significant correlation between
Δ% priming and Δ% cell death (Figure 5A). Note that liquid
tumors in general have a greater cytotoxic response per change 
in priming, perhaps explained by the higher baseline mitochon- 
drial apoptotic priming we observe in hematologic cancer cell 
lines compared to solid tumor cell lines. In addition, the ROC 
analysis suggests that DBP could be a good binary predictor 
for cytotoxicity across a wide range of pathway inhibitors and 
cancer, with an AUC for the ROC curve of 0.89 (Figure 5B). These 
results suggest the most significant hurdle that must be cleared 
for a drug to cause cytotoxicity is simply the initiation of death 
signaling. Regardless of the pathway inhibited and regardless 
of the cell of origin of the cancer, early drug-induced death 
signaling predicts later cytotoxicity. 

Choosing the Best Treatment among Several Options 

A predictive biomarker can be used to identify the best therapy 
among many treatment options for a single patient. To test the 
ability of DBP to identify the most effective therapy among a 
myriad of treatment options we turned to an allograft melanoma 
model. Mouse melanomas harboring compound mutations in
Braf and Nf1 readily grow as allografts and are resistant to selective
BRAF inhibitors but sensitive to (combined) MEK/mTORC1
inhibition (Maertens et al., 2013). To ask whether DBP could
discriminate among the in vivo efficacy of several therapies on
the same tumor model, we exposed Braf/Nf1 mutant melanoma
cells to different targeted agents for 16 hr: PLX4720 (a PLX4032
analog that inhibits mutant BRAF V600E), PD0325901 (referred to
as PD-901, a MEK inhibitor), GDC-0941 (a PI3K inhibitor), and
rapamycin (an mTOR inhibitor), as single agents or in combination.
Of all the treatments tested, PD-901 in combination with
rapamycin induced the greatest increase in priming (Figure 6A).
These findings correlate well with the preclinical data previously
generated using this tumor model (Figure 4C in Maertens et al.,
2013). More specifically, of all (combination) therapies tested
in vivo, the PD-901/rapamycin combination caused the greatest
tumor shrinkage, as summarized in Figure 6B. Across all of the
treatments, we observed a significant correlation between DBP
results and the in vivo data obtained in the Braf/Nf1 mutant
allografts (Figure 6C). These results suggest that DBP can be used
as a predictive biomarker to select among treatment options to 
identify treatments that will provide best in vivo benefit. 
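
The correlation between Δ% priming and change in tumor volume is reported as a Spearman rank correlation. The following is a minimal pure-Python sketch of that statistic, using average ranks for ties. The helper names are hypothetical, and the authors' own analysis was presumably run in statistics software rather than code like this.

```python
def average_ranks(values):
    """Return 1-based ranks of values, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman correlation: the Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5
```

A negative coefficient is the expected direction here: treatments that raise priming more should shrink tumors more, so tumor volume change falls as Δ% priming rises.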

Identifying the Best-Responding Patients to a Single 
Therapy in a Patient Cohort 

Predictive biomarkers can also be used to stratify likelihood of
response to a single therapy among many patients. This can
be described as a companion diagnostic use. Having thoroughly
supported the hypothesis that early death signaling detected by
DBP predicts cytotoxicity in vitro, it was important to test
whether our tool can likewise discriminate between clinical sensitivity
and resistance to anti-cancer therapies using primary patient
samples. We chose treatment of chronic myelogenous
leukemia (CML) with imatinib as a first test of this principle.
CML cells possess a t(9;22) translocation creating a BCR-ABL
fusion protein that results in constitutively active ABL kinase activity.
CML is typically sensitive to inhibitors of ABL kinase
including imatinib (Sawyers, 1999).

984 Cell 160, 977-989, February 26, 2015 ©2015 Elsevier Inc.

[Figure 6C plot: x axis, Δ% priming; Spearman r = -0.89, p (one-tailed) = 0.0062]

To demonstrate the correlation between imatinib’s inhibition of 
ABL and an increase in apoptotic priming, we treated two human 
CML cell lines with different concentrations of imatinib. After 
16 hr of treatment, we observed that the dephosphorylation of 
ABL, and its downstream target CRKL correlated with an in- 
crease in priming. Note that frank cell death began days later, 
at 72 hr (Figure S5). 

We treated bone marrow cells obtained from 24 CML patients
for 16 hr with imatinib, performed DBP, and recorded the change
in priming induced. Initial resistance to imatinib is very rare in
CML, so we compared samples of patients who were newly
started on imatinib, all of whom entered at least a complete
hematologic remission (“sensitive,” Figure 7A), with samples from
patients obtained when they were known to be refractory to
imatinib (“resistant,” Figure 7A). Samples from patients that were
sensitive to imatinib showed a significantly higher Δ% priming
compared to those that did not respond (Figure 7A). We next
tested the ability of DBP to segregate clinical sensitivity and resistance
in a binary fashion with ROC analysis (Figure 7B). The area
under the ROC curve was 0.89, p = 0.016, supporting the ability of
DBP to discriminate clinical sensitivity and resistance. There was
variability in the quality of the tracings obtained, likely due to
variability in the viability of the thawed patient samples. When we
applied criteria only to accept tracings for which there was at
least a difference of 100 relative fluorescent units between our
positive control (FCCP) and negative control, we observed similar
results, with an AUC of 0.88. This came at the cost of excluding
seven samples from analysis based on the criteria (Figure S6).
Essentially every newly diagnosed patient with CML will be started
on imatinib or another tyrosine kinase inhibitor, and nearly all will
have at least a complete hematologic remission. Thus, there is little
need for a new predictive biomarker to guide administration of
tyrosine kinase inhibitors in CML. Nonetheless, this study demonstrates
the principle that DBP can distinguish clinical sensitivity
and resistance to a targeted agent.
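
The tracing-quality criterion described above can be expressed as a simple filter. The sketch below assumes each sample has one positive-control (FCCP) reading and one negative-control reading in relative fluorescence units; the function name, argument names, and the use of an absolute difference are illustrative assumptions.

```python
def passes_tracing_qc(fccp_rfu, negative_control_rfu, min_window_rfu=100.0):
    """Return True if the control window is wide enough to trust the tracing.

    The criterion from the text is a difference of at least 100 relative
    fluorescence units between the FCCP positive control and the
    negative control. The absolute value is used here so the check does
    not depend on which control reads higher (an assumption).
    """
    return abs(negative_control_rfu - fccp_rfu) >= min_window_rfu
```

Samples that fail this check would be excluded before computing the ROC curve, which is the trade-off noted in the text: a comparable AUC at the cost of dropping samples.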



Figure 6. DBP Can Identify the Best In Vivo Treatment among
Several Options

Braf/Nf1 mutant melanoma cells were treated ex vivo with PLX4720 1 µM,
PD0325901 (referred to as PD-901) 0.25 nM, GDC-0941 1 µM, rapamycin
0.1 µM, PD-901 + rapamycin, and PLX4720 + rapamycin.

(A) DBP (16 hr incubation) results expressed as Δ% priming. Values indicate
mean values ± SEM, at least three independent experiments were performed
(n ≥ 3).

(B) In vivo response for this Braf/Nf1 mutant allograft melanoma model
(adapted from Figure 4C in Maertens et al., 2013) expressed as change in tumor
volume (log2) after 7 days of treatment.

(C) Correlation between Δ% priming and change in tumor volume.



DBP Predicts Carboplatin Response in Ovarian 
Cancer Patients 

Although our testing was focused on using DBP with targeted 
agents, pro-death signaling resulting from treatment with clas- 
sical cytotoxic chemotherapies should be predictive of cellular 
response since these drugs also largely kill via the mitochondrial 
apoptotic pathway. We obtained 16 primary ovarian adenocarci- 
nomas from surgical resection. We treated a single cell suspen- 
sion of these tumors with carboplatin, the standard front-line 
therapy, ex vivo for 16 hr and performed DBP. We detected a
robust Δ% priming (>20%) in six of the patient specimens.
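
The 20% cutoff splits the cohort into two groups that can then be compared for progression-free survival. A minimal sketch of that grouping step follows. The function name and sample identifiers are hypothetical, and assigning a value of exactly 20 to the lower group is an assumption, since the text only specifies >20% and <20%.

```python
def stratify_by_priming(delta_priming_by_sample, threshold=20.0):
    """Split samples into (high, low) groups by delta % priming.

    Samples strictly above the threshold go to the high-priming group;
    all others go to the low-priming group. Each group is returned as a
    sorted list of sample identifiers.
    """
    high = sorted(s for s, v in delta_priming_by_sample.items() if v > threshold)
    low = sorted(s for s, v in delta_priming_by_sample.items() if v <= threshold)
    return high, low
```

The two lists would then feed a survival comparison such as a Kaplan-Meier analysis, which is outside the scope of this sketch.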

All analyzed patients were then treated with carboplatin in 
combination with taxol in the clinic. We then collected and 
Figure 7. DBP Can Stratify In Vivo Drug Response to Imatinib in a
Cohort of CML Patients and to Carboplatin in Ovarian Adenocarcinoma
Patients

(A) 24 frozen Ficoll-purified bone marrow primary CML samples were treated
for 16 hr with imatinib 1 and 5 μM, and DBP was then performed. Results are
expressed as Δ% priming. Values indicate mean ± SEM. Unpaired t-test,
two-tailed, * p < 0.05.

(B) A ROC curve analysis for this set of samples. The AUC is 0.89. 

(C) 16 ovarian adenocarcinoma patient samples were analyzed by DBP with
carboplatin. We treated the samples for 16 hr with carboplatin 100 μg/ml, and
DBP was then performed. Shown is a Kaplan-Meier plot of the patients’
progression-free survival in response to carboplatin and taxol. A significant
difference was observed between patients whose samples showed a Δ%
priming >20% and those whose samples showed <20%, as assessed by Mantel-Cox
statistical analysis.

See also Figures S5 and S6, and Table S1.



analyzed the clinical data on the patients to assess progression 
free survival using an abnormal and rising CA-125 as an index for 
progression. Patients with ovarian adenocarcinomas that exhibited
a robust Δ% priming (>20%) experienced a significantly



DISCUSSION 

Here, we tested and supported the hypothesis that the initiation 
of death signaling is sufficient to determine eventual commit- 
ment to cell death. By detecting early death signaling, DBP 
can predict in vitro and in vivo cytotoxic response in varied can- 
cers to varied classes of chemotherapeutic agents, agents 
which have in common only their ability to kill cancer cells via 
the mitochondrial pathway of apoptosis. While this provides 
basic mechanistic information about the events between drug 
treatment and commitment to cell death, we anticipate that its 
greatest utility might be in prediction of cancer patients’ 
response to therapy in the clinic. Over the past decade, an 
ever-growing number of therapies have been approved for use 
in medical oncology. But every tumor is distinct, with its own 
particular signaling network and pathway addiction yielding a 
distinct pattern of sensitivity to cancer therapeutics. The task 
of precision cancer medicine is to match a tumor to those agents 
that will most effectively eliminate it (Garraway and Janne, 2012; 
Sawyers, 2008; Yaffe, 2013). 

An analogous problem was faced in the previous century in the 
world of clinical microbiology. As the number of antibiotics prolif- 
erated, it became more challenging to identify the best drug for a 
particular isolate of bacteria. The very practical solution that 
emerged was to simply grow a lawn of bacteria and expose 
the isolate to all available antibiotics in the form of drug-soaked 
disks. Antibiotics were then chosen from those that caused the 
greatest elimination of bacteria. This practice is still the standard 
and has not been displaced by any modern technology, in- 
cluding genomics, proteomics or systems biology. While this 
method reveals little about signaling pathways and genetics of 
bacteria, it is supremely useful because it functionally summa- 
rizes the contribution of many genes and pathways to the pheno- 
type that is most pertinent, the response of the viable bacterium 
to antibiotics. A version of this assay has been the mainstay of 
clinical microbiology for many decades. 

Analogous ex vivo approaches have been attempted in 
oncology but with little success. A typical strategy was to expose 
a patient’s tumor to drugs and place it into ex vivo culture for 3-14 days,
followed by evaluation of cell death, proliferation, or colony
formation (Burstein et al., 2011). The biggest difficulty was
the requirement for ex vivo culture of cancer cells. Many cancer
cells simply die rapidly in ex vivo culture. Those that survive can
undergo arrest or other phenotypic changes that accompany the
transfer from a comfortable in vivo niche to an ex vivo plastic dish
in 21% oxygen. In addition, if the culture is prolonged, there can
be selection for non-tumor cells or clones that are poorly repre- 
sentative of the patient’s tumor. The result, in any case, was a se- 
ries of studies that did not provide sufficient predictive power to 
be clinically useful. Exciting new ex vivo cell culture strategies 
using more modern techniques require weeks to months (Crystal 
et al., 2014). Their utility in guiding patient care will doubtless be 
tested in the coming years. Patient-derived xenograft (PDX) 
mouse models are being tested as a newer venue for functional 
assessment of tumor cell response to therapy (Hidalgo et al.,
2014). However, the time (months) and expense that are required
to establish PDX models may limit their utility in clinical medicine.

Here, we have taken a different approach. Appreciating the 
tremendous advantages of perturbing the actual patient tumor 
cell with the actual therapy of interest, we instead have prioritized 
making observations early enough that long term ex vivo culture 
is not needed. While we have found that death signaling can be 
detected as early as 4 hr after treatment, depending on the drug, 
we have found that a 16 hr incubation is sufficient for most 
agents to produce measurable death signaling in responding 
cells. 

We demonstrated that DBP can be exploited to select among 
many therapies the one that is best for a single tumor (Figure 6). 
We also demonstrated that DBP can select among many pa- 
tients those that are most likely to respond to a single therapy 
(Figure 7). These are the two major functions of a clinically useful 
predictive biomarker, and it is notable that DBP can perform 
them both. Of equal importance, the clinical and in vivo experiments
of Figure 7 demonstrate that useful predictive observations
of both liquid (CML) and solid (ovarian) primary human
tumors are compatible with a simple 16 hr monolayer culture.

We anticipate that DBP may be used to make personalized 
choices of therapy for patients. One could use DBP to choose 
agents among a panel of candidate drugs for one individual pa- 
tient. Alternatively, one could use DBP to stratify a panel of pa- 
tients to identify those most likely to respond to an individual 
drug. In the case of drugs that have activity only in a subset of 
a particular disease, we believe DBP can more efficiently stratify 
patient selection for clinical trials or clinical use by prospectively 
identifying those whose tumors are most likely to respond. In 
addition, while our focus here was on cancer cells, it is important 
to realize that this approach is also applicable to the study of 
non-malignant cells. As such, it can be used as a probe of sen- 
sitivities of cells in normal biology to a variety of insults, or as a 
toxicology tool to predict the toxicity of novel agents to normal 
tissues. 

While we have focused mainly on single agent therapies in our 
proofs of principle studies, a strength of this approach is that it 
should work for both single agent and combination therapies. 
In fact, we explicitly demonstrated this in Figures 1, 2, and S3.
Given the nearly universal emergence of resistance to single
agent targeted therapies, even when there is an excellent initial
response, strategies for the rational choice of personalized combination
therapies are of great importance. We can envision two
ways DBP could be used to fashion such strategies. One is to
simply expose tumor cells to the combinations as we did in Figures
1, 2, and S3. Another is to test a panel of single agents via
DBP and combine two or more with good single agent activity. 

A tremendous amount of information has been collected on 
cancer cells in the past few years, and the amount is likely to 
continue to grow exponentially. Much of this information is 
now genetic, with whole cancer genomes being sequenced (Barretina
et al., 2012). In addition, there are technologies that garner
an abundance of gene expression information and those that 
capture protein expression (Kornblau et al., 2009). It remains to 
be seen how widely these technologies will be useful in better as- 
signing therapy to patients. However, despite the huge amounts 
of information acquired, one common limitation of these studies 



is that they all represent static observations of dead cells. That 
means that a tremendous amount of the functional complexity 
of the cell has been lost to study. With DBP, we anticipate that 
a small number of strategic perturbations (drug and peptide ex- 
posures) on viable cells will yield vastly fewer bits of information, 
but that a great proportion of the bits will be clinically actionable. 

EXPERIMENTAL PROCEDURES 

Cell Lines and Treatments 

RPMI 1640 medium supplemented with 10% heat-inactivated fetal bovine serum
(GIBCO), 10 mM L-glutamine, 100 U/ml penicillin, and 100 μg/ml streptomycin
was used for the culture of the cell lines used. The cells were cultured
at 37°C in a humidified atmosphere of 5% CO2.

Isolation and Treatment of Primary CML Cells 

Thirty primary CML samples from bone marrow biopsies viably frozen in 90%
FBS/10% DMSO were obtained from the Pasquarello Tissue Bank at Dana-Farber
Cancer Institute and from Dr. Philip C. Amrein at the Massachusetts
General Hospital. Cells were thawed and resuspended in complete RPMI media,
washed with fresh media, counted by trypan blue exclusion, plated
in a 12-well plate at 1 million cells/well, and treated with imatinib 1 and 5 μM.
DBP failed on five samples due to failure of mitochondria to maintain transmembrane
polarization, and one sample analysis was discarded for not having
complete clinical information. After a 16 hr incubation at 37°C in a humidified
atmosphere of 5% CO2, Dynamic BH3 Profile analysis was performed. Clinical
response data were compiled by clinicians; patients were considered responders
when a complete hematologic response was observed.

Ovarian Primary Tumors 

Fresh primary tumors, obtained from routine resections after patients signed an
informed consent approved by the Institutional Review Board (DFCI#02-051),
were used for preparation of viable single-cell suspensions. Tumors were first
mechanically dissociated and digested for 1 hr at 37°C in 1 mg/ml collagenase/dispase
(Roche Diagnostics). Cells were then filtered through a cell
strainer, and cell viability was assessed by trypan blue exclusion. Cells were
then frozen in freezing buffer (fetal bovine serum with 10% DMSO). For DBP,
cells were thawed and resuspended in complete RPMI media with 100 U/ml
of DNase I and incubated for 15 min at room temperature. Then the cells were
washed with fresh media, counted by trypan blue exclusion, plated in a
12-well plate at 0.2-0.5 million cells/well, and treated with carboplatin 100 μg/ml.
After a 16 hr incubation at 37°C in a humidified atmosphere of 5% CO2,
Dynamic BH3 Profile analysis was performed blinded to clinical outcome. Clinical
response data were compiled by clinicians 6-24 months after sample
acquisition.

Dynamic BH3 Profiling 

2 × 10^4 cells/well were normally used, but 4 × 10^4 cells/well were used for primary
CML and AML. 15 μl of BIM BH3 peptide (final concentration of 0.03, 0.1,
0.3, 1, and 3 μM) in T-EB (300 mM trehalose, 10 mM HEPES-KOH [pH 7.7],
80 mM KCl, 1 mM EGTA, 1 mM EDTA, 0.1% BSA, 5 mM succinate) was
deposited per well in a black 384-well plate (BD Falcon no. 353285). Single-cell
suspensions were washed in T-EB before being resuspended at 4× their
final density. One volume of the 4× cell suspension was added to one volume
of a 4× dye solution containing 4 μM JC-1, 40 μg/ml oligomycin, 0.02% digitonin,
and 20 mM 2-mercaptoethanol in T-EB. This 2× cell/dye solution stood at RT
for 10 min to allow permeabilization and dye equilibration. A total of 15 μl of the
2× cell/dye mix was then added to each treatment well of the plate, shaken for
15 s inside the reader, and the fluorescence at 590 nm was monitored every 5 min
at RT. Percentage loss of mitochondrial membrane potential for the peptides is
calculated by normalization to the solvent-only control DMSO (0% depolarization)
and the positive control FCCP (100% depolarization). Individual DBP analyses
were performed using triplicates for DMSO, FCCP, and the different BIM BH3
concentrations used, and the expressed values stand for the average of three
different readings. In cases where SD was >10%, the outlying reading was discarded.
% priming stands for the maximum % depolarization obtained from the different BIM BH3



Cell 160 , 977-989, February 26, 2015 ©2015 Elsevier Inc. 987 







concentrations tested; typically 0.03, 0.1, 0.3, 1, and 3 μM. Δ% priming
stands for the difference between treated and non-treated cells
(% priming(treated) − % priming(non-treated)). See also Figure S1.
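
The normalization described above can be illustrated with a minimal sketch. It assumes that each well has already been reduced to a single averaged JC-1 signal value (the text does not state how the kinetic trace is summarized), and all function names and numbers below are hypothetical, chosen only for illustration.

```python
def percent_depolarization(signal, dmso, fccp):
    """Scale a JC-1 signal so that DMSO maps to 0% and FCCP to 100% depolarization."""
    return 100.0 * (dmso - signal) / (dmso - fccp)


def percent_priming(signal_by_conc, dmso, fccp):
    """% priming: the maximum % depolarization across the BIM BH3 concentrations."""
    return max(percent_depolarization(s, dmso, fccp) for s in signal_by_conc.values())


# Hypothetical averaged signals (arbitrary units), keyed by BIM BH3 concentration in uM.
untreated = {0.03: 980.0, 0.1: 950.0, 0.3: 900.0, 1.0: 800.0, 3.0: 600.0}
treated = {0.03: 940.0, 0.1: 880.0, 0.3: 760.0, 1.0: 560.0, 3.0: 360.0}
DMSO, FCCP = 1000.0, 200.0  # hypothetical control signals

priming_untreated = percent_priming(untreated, DMSO, FCCP)  # 50.0
priming_treated = percent_priming(treated, DMSO, FCCP)      # 80.0
delta = priming_treated - priming_untreated                 # delta % priming
print(delta)  # 30.0
```

With these made-up values the drug-treated sample would exceed the 20% Δ% priming cutoff used for the ovarian samples.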

Cell Viability Assays 

Cells were stained with fluorescent conjugates of Annexin-V (BioVision) and/or 
propidium iodide (PI) and analyzed on a FACS Canto machine (BD). Viable cells 
are annexin-V negative and PI negative, and cell death is expressed as 100% −
viable cells. Δ% cell death stands for the difference between treated and
non-treated cells (% cell death(treated) − % cell death(non-treated)).

Immunoblotting

Total cell lysates were prepared in 1% CHAPS buffer (5 mM MgCl2, 137 mM
NaCl, 1 mM EDTA, 1 mM EGTA, 1% CHAPS, 20 mM Tris-HCl [pH 6.5], and protease
inhibitors [Complete, Roche]). Cells were washed twice, resuspended
with 50-100 μl of CHAPS lysis buffer, and kept on ice for 30 min. Then, the
cellular suspension was centrifuged at 16,100 g for 5 min, and the supernatant
was used to perform the immunoblotting analysis.

Twenty micrograms of protein were loaded on NuPAGE 10% Bis-Tris 
polyacrylamide gels (Invitrogen). The following antibodies were used to detect 
proteins on the membrane (dilution 1:1,000): Actin (Chemicon, MAB1501), 
PARP-1 (Cell Signaling, #9542), BCL-2 (Epitomics, #1017-1), BIM (Cell
Signaling, #2933), and BCL-xL (Cell Signaling, #2762).

Statistical Analysis 

Statistical significance of the results was analyzed using Student’s t test
in GraphPad Prism 5.0 software. *p < 0.05 and **p < 0.01 were considered
significant. SEM stands for standard error of the mean. For ROC curve analysis,
cell lines were considered responsive to treatment when Δ% cell death
was >10%; CML clinical samples, when the patient achieved a complete hematologic
response after treatment; for ovarian adenocarcinoma biopsies, clinical
response data were compiled by clinicians 6-24 months after sample
acquisition.
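
As an illustration of the ROC analysis described above, the following sketch labels samples as responders or non-responders and computes the area under the ROC curve with the rank-based (Mann-Whitney) formulation. The authors used GraphPad Prism; this is not their procedure, and the Δ% priming and Δ% cell death values are invented for the example.

```python
def roc_auc(scores, is_responder):
    """AUC as the probability that a responder scores higher than a non-responder.

    Ties between a responder and a non-responder count as half a win.
    """
    pos = [s for s, r in zip(scores, is_responder) if r]
    neg = [s for s, r in zip(scores, is_responder) if not r]
    if not pos or not neg:
        raise ValueError("need at least one responder and one non-responder")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical cell-line data: delta % priming (predictor) and delta % cell death (outcome).
delta_priming = [35.0, 28.0, 22.0, 12.0, 18.0, 5.0, 8.0, 25.0]
delta_cell_death = [60.0, 45.0, 30.0, 15.0, 4.0, 2.0, 9.0, 7.0]

# Responder definition taken from the text for cell lines: delta % cell death > 10%.
responders = [d > 10.0 for d in delta_cell_death]

auc = roc_auc(delta_priming, responders)
print(auc)  # 0.8125
```

An AUC of 0.5 would indicate no discrimination and 1.0 perfect separation of responders from non-responders.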

SUPPLEMENTAL INFORMATION 
In Brief

Although HIV latency is currently thought 
to arise when an infected cell transitions 
from an activated to a resting state that is 
non-permissive to viral expression, a 
combination of modeling and synthetic 
control of HIV Tat positive feedback 
demonstrates that latency establishment 
operates autonomously from cell state. 



Highlights 

• HIV expression persists even when primary cells transition 
from activated to resting 

• Tat positive-feedback circuitry drives this autonomy from 
cell-state relaxation 

• Orthogonal activation of Tat shows that the circuitry suffices 
for autonomous latency 
SUMMARY

Biological circuits can be controlled by two general 
schemes: environmental sensing or autonomous 
programs. For viruses such as HIV, the prevailing hy- 
pothesis is that latent infection is controlled by cellular 
state (i.e., environment), with latency simply an epi- 
phenomenon of infected cells transitioning from an 
activated to resting state. However, we find that HIV 
expression persists despite the activated-to-resting 
cellular transition. Mathematical modeling indicates 
that HIV’s Tat positive-feedback circuitry enables 
this persistence and strongly controls latency. To 
overcome the inherent crosstalk between viral cir- 
cuitry and cellular activation and to directly test this 
hypothesis, we synthetically decouple viral depen- 
dence on cellular environment from viral transcription. 
These circuits enable control of viral transcription 
without cellular activation and show that Tat feedback 
is sufficient to regulate latency independent of cellular 
activation. Overall, synthetic reconstruction demon- 
strates that a largely autonomous, viral-encoded pro- 
gram underlies HIV latency — potentially explaining 
why cell-targeted latency-reversing agents exhibit 
incomplete penetrance. 

INTRODUCTION 

Diverse biological systems, both natural and engineered, face 
the challenge of surviving in variant and unpredictable environ- 
mental conditions. One strategy is to sense surrounding condi- 
tions and respond with environment-specific developmental 
programs; there is a 1:1 correspondence between explicit sensor-actuators
and the extremely reduced form of this scheme in
which sensing and actuation are so tightly coupled that the environment
entirely actuates the program (Bull and Vogt, 1979). An
alternate strategy foregoes environmental sensing and actua- 
tion, instead relying on autonomous programs (Knedler, 1947), 
for example programs that intrinsically generate heterogeneity 
in phenotypes and allow probabilistic “bet hedging” (Cohen, 
1966). For many systems, such as bacteriophage λ, it is unclear
whether environmental sensor-actuator schemes or autono- 
mous programs are employed (Arkin et al., 1998; St-Pierre and 
Endy, 2008; Zeng et al., 2010). The ensuing debates carry evolutionary
significance since sensor-actuator regulation can be
driven by crosstalk from coincidental signals and hence tied 
to unrelated epiphenomena, whereas autonomous circuits are 
invariably subjected to direct natural selection pressures. In 
other words, if a phenotype is controlled by sensor-actuator 
regulation, it can be an “epiphenomenon,” but if autonomously 
regulated, the phenotype is invariably evolutionarily hardwired
and directly selected for. 

For HIV, the debate is clinically relevant; it remains unclear 
whether the host-cell environment or autonomous viral circuitry 
controls proviral latency, a long-lived viral dormancy state that 
is the chief barrier to curative therapy (Richman et al., 2009; Weinberger
and Weinberger, 2013). Upon infecting CD4+ T lymphocytes,
HIV either actively replicates to rapidly produce progeny
virions or can enter a long-lived quiescent state (proviral latency), 
from which it subsequently reactivates. These latently infected 
cells form a viral reservoir, forcing patients to remain on lifelong 
suppressive therapy. The prevailing view (Coffin and Swanstrom, 
2013; Siliciano and Greene, 2011) holds that proviral latency re- 
sults from HIV transcription being controlled by the host-cell 
activation state (i.e., environment) since relaxation of activated 
lymphocytes to a resting-memory state is correlated with 
increased epigenetic silencing of the HIV promoter and increased 
cytoplasmic sequestration of transcription factors that activate 
HIV transcription (Pearson et al., 2008; Tyagi et al., 2010). In 
this model, HIV infects activated T cells, which allow active viral 
replication, and if these cells “relax” to resting-memory T cells, 
which generally restrict HIV infection, viral latency ensues
(Figure 1, left).

In contrast to the cellular control hypothesis, there is circum- 
stantial evidence for an alternate model wherein latency is 
controlled by viral gene-regulatory circuitry (Ho et al., 2013; 
Jeeninga et al., 2008; Weinberger et al., 2005) without strict 
dependence on cellular state (Figure 1, right). HIV encodes a
transcriptional master circuit that is driven by the HIV Tat pro- 
tein, which amplifies expression from the viral promoter within 
the HIV long terminal repeat (LTR), establishing positive feed- 
back. Critically, minimal Tat positive-feedback circuits can 
recapitulate latency, and stochastic fluctuations between a 
Figure 1. Two Models of HIV Latency Regulation:
Cell-State Control versus Autonomous Programming

(A) (Left) The prevailing hypothesis of HIV proviral latency regulation.
As T cells relax from an activated state (permissive to infection) to a
resting-memory state, the host-cell environment 
silences HIV gene expression, restricting Tat 
transactivation of the LTR. (Right) The alternate 
hypothesis that HIV Tat positive feedback is robust 
to changes of the host-cell environment and op- 
erates autonomously despite changes in cell state. 
The overlapping nature of cellular and viral regula- 
tory circuits confounds testing between these hy- 
potheses (i.e., the LTR actuates Tat feedback but 
doubles as a sensor of the host-cell environment). 

(B) If cell state and viral circuitry can be orthogo- 
nalized (i.e., decoupled), the influence of cellular 
state on viral latency can be analyzed via an 
orthogonal 2D graphical correlation. (Left) If 
cellular state dominates regulation of viral latency, 
resting cells would inhibit viral circuitry while active 
cells would induce viral gene expression, gener- 
ating a strong correlation between cell state and 
viral activity. (Right) If an autonomous latency cir- 
cuit regulates latency, both latent and active viral 
expression could be generated in either resting 
cells or activated T cells, producing little correla- 
tion between cell state and viral activity. 



transcriptionally on and off state in the Tat circuit are sufficient 
to drive a phenotypic bifurcation between active and latent 
expression, even in non-resting cells (Weinberger et al., 2005). 
However, there is also evidence that cellular factors modulate 
stochastic HIV expression to drive latency (Burnett et al., 
2009), confounding the hypothesis that latency is controlled 
by an autonomous viral circuit. 

Here, we test between the cell-state and autonomous-circuit 
hypotheses for latency establishment. If latency is regulated by 
host-cell state, viral expression should be tightly correlated 
with cell state, whereas if the latency circuit is hardwired to func- 
tion autonomously, then cellular state would be uncorrelated 
with viral expression and tuning viral circuitry, independent of 
cell state, would be sufficient to control HIV latency (Figure 1B).
Surprisingly, we find that viral expression is robust to cellular-activation
state in primary T cells, and mathematical models indicate
that this autonomy results from intrinsic properties of the
HIV Tat positive-feedback circuit. However, directly testing cir- 
cuit autonomy to cell state is confounded by overlap between 
cellular and viral networks: the same transcription factors that
alter cellular activation also activate the HIV LTR, triggering Tat 
positive feedback (Karn, 2011). To circumvent this overlap, we 
synthetically reconstruct the Tat circuit to decouple viral depen- 
dence on the cellular environment from viral transcriptional regu- 
lation (i.e., decouple viral sensing and actuation). The refactored 
circuits chemically modulate viral expression independent of 
cellular activation levels and show that Tat circuitry is sufficient 
to overcome cell-driven silencing of HIV transcription during 
cellular relaxation from active to resting. Overall, the results 
argue that the Tat circuit is hardwired to establish latency largely 
autonomous of cellular state. 



RESULTS 

Donor-Derived Primary T Lymphocytes Maintain Robust 
HIV Expression during Cellular Relaxation from 
Activated to Resting 

To test the prevailing “epiphenomenon” hypothesis of HIV 
latency establishment, we pre-activated donor-derived primary human
CD4+ T lymphocytes with αCD3/CD28 (to achieve a
CD25+CD69+ phenotype), infected them with full-length HIV-1
virus, and then removed activation stimuli, allowing infected cells
to relax to a resting (CD25−CD69−) state (Figure 2A). The virus
used (HIV-d2GFP) encodes a short-lived 2-hr half-life GFP 
(d2GFP) reporter to enable rapid detection of viral transcriptional 
silencing and is env mutated (i.e., single-replication round) 
to avoid confounding the data with expansion of the infected 
cell population. Infected cells were sampled periodically over 
2 weeks for cellular activation status (as quantified by CD25 
and CD69) alongside viral-GFP expression. 

Surprisingly, viral expression appears remarkably robust during 
the cellular transition from activated to resting (Figures 2B and 
2C-2H). Despite drastic decline in cellular activation both in 
CD25 (Figures 2D and 2G) and in CD69 (Figures 2C and 2F), viral 
activity (quantified by GFP expression of productively infected 
cells) remained relatively unchanged (Figures 2B, 2E, and 2H). 
The resilience of viral gene expression despite cellular relaxation 
is not due to differential relaxation of productively infected cells 
compared to the overall population, as productively infected cells 
relax at the same rate as the overall population (Figure S1).

Since human primary cells represent a mixed co-culture (i.e., 
infected and uninfected subsets of cells), which may obfuscate 
the interpretation of results (Jordan et al., 2003), we also 
Figure 2. HIV Expression Is Autonomous to Changes in Cellular State: Transitioning of Primary T Lymphocytes from Activated to Resting
Does Not Silence HIV Expression

(A) Schematic of activation, infection, and long-term observations of relaxing primary CD4+ T cells with full-length HIV-d2GFP. Donor-derived primary cells were
activated with αCD3/CD28 beads in the presence of rIL-2 for 3 days, following which beads were removed and the cells were infected. At indicated time points,
cells were collected for flow-cytometry-based measurement of CD25/CD69 levels and GFP expression. Data shown (in B-E) are representative of duplicate
infections performed with cells from two donors.



performed a refined version of the experiment by isolating HIV-infected
cells through FACS sorting and tracking this purified
population of infected lymphocytes as cells relaxed to resting
(Figure 2I). As before, even after 2 weeks of culture, ~90% of
cells maintain high-level viral expression (Figure 2J) despite
cellular relaxation to resting (Figure S1). Collectively, these two
experiments show that, despite a 10-fold decline in CD4+ T cell
activation levels, the impact on viral gene expression is minimal, 
suggesting that viral circuitry is largely autonomous to cellular 
state. 

Computational Analysis Predicts that Tat Feedback 
Circuitry Can Autonomously Generate Active and Latent 
Infection across a Broad Range of Cellular-Activation 
States 

To investigate how viral transcription remains robust despite cell- 
state changes, we employ a simplified computational model of 
HIV transcriptional regulation (Figure 3A) based on previous 
studies (Weinberger et al., 2008). This model builds off the standard
two-state model of transcription (Kepler and Elston, 2001;
Paulsson, 2004) and allows the LTR promoter to stochastically
toggle between a transcriptionally non-permissive state (LTR_OFF)
and a transcriptionally permissive state (LTR_ON) at rates koff and
kon, respectively. In the LTR_ON state, Tat protein can transactivate
the promoter, enhancing transcriptional elongation at a rate
ktransact. These parameters (koff, kon, and ktransact) have been quantified
by single-cell analysis (Dar et al., 2012; Singh et al., 2010;
Weinberger et al., 2008), and measurements at thousands of
HIV integration sites across the human genome show kon to be
the predominant parameter that alters LTR activity in the regime
required for latency (Dar et al., 2012), i.e., the weak expression
regime. Potent cell-state activators, such as tumor necrosis factor
α (TNFα), which acts through the same pathway as αCD3/CD28
activation, maximally stimulate LTR activity by increasing
kon by 1.5- to 2-fold (Dar et al., 2012, 2014; Jordan et al., 2001).

To determine whether relaxation of activated T cells (i.e., de- 
creases in kon) can drive LTR-Tat circuit shutoff and latency, 
we simulated infection of activated T cells and examined how 
tuning kon alters the fraction of trajectories in the ON state; 
i.e., initial conditions were LTR_ON = 1 and all other molecular
species = 0 (see Table S1), thereby allowing efficient Tat turn-on
in activated cells with subsequent stochastic circuit shutoff.
The simplified model recapitulates previous results showing a 
phenotypic bifurcation in Tat levels (Weinberger et al., 2005), 
with a fraction of trajectories remaining ON and a fraction turning 
OFF (Figure 3B) for any given kon across a broad range of values 



(Figure 3C). Indeed, for LTR activities within three orders of
magnitude (Figure S2), any trajectory can maintain either an
ON or OFF state purely by altering the level of Tat without a
change in basal LTR activity. Thus, the model predicts that, at
a given cellular-activation state (kon value), circuit activity could
be toggled ON and OFF simply by supplying Tat alone (e.g., in
trans) without activating the LTR or changing the cellular-activation
state (e.g., via TNFα). Moreover, the ON fraction can also be
altered by changing Tat abundance, and hence feedback
strength, through Tat half-life modulation (Figure S2).
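
The kind of simulation described above can be sketched with a Gillespie stochastic simulation of a two-state promoter with Tat positive feedback. This is a deliberately reduced toy, not the authors' model: the mRNA step is omitted, the saturating transactivation term and the ON threshold are assumptions, and every rate value is hypothetical rather than taken from Table S1.

```python
import random


def simulate_tat(kon, koff=0.1, k_basal=0.5, k_transact=5.0, K=10.0,
                 decay=0.2, t_end=40.0, rng=None):
    """Return the Tat copy number at t_end for one stochastic trajectory.

    Starts from LTR_ON = 1 and Tat = 0, mirroring the initial conditions in the
    text. All rates are illustrative values in arbitrary per-hour units.
    """
    rng = rng or random.Random()
    t, ltr_on, tat = 0.0, 1, 0
    while True:
        rates = [
            kon * (1 - ltr_on),                                   # LTR_OFF -> LTR_ON
            koff * ltr_on,                                        # LTR_ON -> LTR_OFF
            ltr_on * (k_basal + k_transact * tat / (K + tat)),    # Tat production
            decay * tat,                                          # Tat degradation
        ]
        total = sum(rates)
        if total == 0.0:          # absorbing state: promoter off, no Tat, kon = 0
            break
        t += rng.expovariate(total)   # waiting time to the next reaction
        if t > t_end:
            break
        r = rng.random() * total      # choose a reaction in proportion to its rate
        if r < rates[0]:
            ltr_on = 1
        elif r < rates[0] + rates[1]:
            ltr_on = 0
        elif r < rates[0] + rates[1] + rates[2]:
            tat += 1
        elif tat > 0:
            tat -= 1
    return tat


def fraction_on(kon, n=200, threshold=5, seed=0):
    """Fraction of trajectories whose final Tat level is at or above a threshold."""
    rng = random.Random(seed)
    return sum(simulate_tat(kon, rng=rng) >= threshold for _ in range(n)) / n


if __name__ == "__main__":
    for kon in (0.1, 0.05):  # a hypothetical 2-fold decrease in kon
        print(kon, fraction_on(kon))
```

Comparing the ON fraction at a given kon with the fraction at half that value is one way to probe, in this toy setting, the robustness to cellular relaxation that the text discusses; the numbers it prints depend entirely on the assumed parameters.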

Next, we directly examined how decreases in kon influenced 
circuit activity. For all 2-fold decreases in kon (over three orders 
of magnitude), there is >90% robustness in the percentage of trajectories
in the ON state (Figure 3D). 2-fold decreases in LTR activity
were examined because removal of cell-state activators
(e.g., TNFα) results in 1.5- to 2-fold reductions in LTR activity
(Dar et al., 2012; Jordan et al., 2001), but comparable circuit
robustness was observed for all 4-fold and even 1-log
reductions in kon (Figure S2). In fact, the simplified nature of the
computational model allows derivation of an analytical “closed- 
form” solution for the fraction of ON trajectories as a function 
of time for all parameters (see Extended Experimental Proce- 
dures), thereby enabling phase-plane analysis of the ON fraction 
as a function of kon and ktransact (Figure S2). This phase-plane 
sensitivity analysis demonstrates that, throughout the physiological
parameter regime of ktransact > kon (Dar et al., 2012; Molle
et al., 2007), even if an infected cell lives far longer than the
in vivo lifetime of 40 hr (Perelson et al., 1996), kon modulation 
cannot substantially alter the ON fraction. To be completely 
sure that these results were not a peculiarity of the specific model 
used, we also examined an alternate positive-feedback model 
topology (Weinberger et al., 2005), which encodes substantially
more molecular detail but is experimentally validated, and we
observed similar circuit robustness to decreases in kon
(Figure S2). Analytical solution shows that this robustness results
from the strong positive feedback (ktransact > kon), since changes
in kon produce small corrections. Notably, despite the circuit’s
robustness to cellular relaxation (kon decreases), high values of
kon do generate less-frequent latency in both the simplified model
(Figure 3C) and the complex models (Weinberger et al., 2005). In
fact, the analytical solution quantifies how increases in kon (e.g.,
via NF-κB stimulation) reactivate the circuit from a latent state
(Equation 12, Extended Experimental Procedures). 

Overall, the results demonstrate robustness of LTR-Tat circuit 
activity to cellular relaxation (i.e., reductions in kon), consistent 
with primary cell observations (Figure 2), but, critically, also 



(B) Flow cytometry time course of CD25 and GFP levels taken on indicated days post infection. Dotted line indicates gating for productively infected cells (GFP+).
(C-E) Histograms of cellular activation levels CD25 (C) and CD69 (D) of the entire population alongside GFP expression from productively infected cells (cells in
GFP+ gate in B) over the course of 13 days post infection (17 days post cellular activation).

(F-H) Cellular activation levels and GFP levels for all replicates over the experimental time course. Each dot indicates the time point from an independent infection 
and represents the geometric mean of the distribution as seen in C-E. Solid line connects the mean of the replicates. CD25 and CD69 normalized to day 0 
(maximal); GFP normalized to day 4 when viral activity is first observed. 

(I) Schematic of FACS-based isolation of productively infected cells. 4 days post infection, GFP+ cells were isolated and cultured (repeated for two donors).

(J) Histograms of isolated GFP+ cells over time. Numbers indicate the proportion of cells that fall within the gate for positive GFP expression (marked by horizontal
black bar). Day 4: Gray histogram shows the infected population prior to FACS-based separation. Viral titer was calibrated to achieve 10% infection (fraction of
gray histogram that is GFP+ at day 4). Histogram in green (for days 4, 9, and 13) shows the GFP expression in the isolated productively infected cells (post sort). All
data shown above are from donor 1.

See Figure S1 for results from donor 2 and CD25 expression decline during the experiment.
Figure 3. Computational Analysis Predicts that Tat Positive-Feedback Circuitry Underlies HIV Autonomy to Cell State

(A) Schematic of a simplified model of the Tat-feedback circuit. The LTR promoter can toggle between a state where transcriptional elongation is stalled (LTR_OFF)
and a state where elongation proceeds (LTR_ON) at rates koff and kon, respectively (Dar et al., 2012; Singh et al., 2010, 2012), and Tat protein transactivates
the promoter by enhancing transcriptional elongation at a rate ktransact (Razooky and Weinberger, 2011). Tat protein and mRNA decay at rates 6^ and 6p, 
respectively. 

(B) Stochastic Monte-Carlo simulations (“Gillespie” algorithm) of Tat protein levels (in arbitrary number of molecules) in individual cells over time (from reaction
scheme in A). Each trajectory represents an individual cell; 100 single-cell trajectories shown (initial conditions for all species equal zero at time t = 0, except
LTR_ON = 1); see Extended Experimental Procedures for reaction rates.

(C) Bee-swarm plots of circuit activity (Tat levels at t = 200) over a range of kon values. Each data point represents a single-cell trajectory (200 trajectories shown
per kon value). The width of the collection of cells (dots) having zero level of Tat (bottom of each kon value simulated) shows that high values of kon do generate less
frequent latency (smaller number of dots). Compare, for example, the spread of red dots and black dots (two different kon values) at 0.

show sensitivity of latency to changes in Tat abundance or
changes in Tat half-life. Below, we experimentally test these 
computational predictions: (1) that LTR-Tat circuit activity be- 
tween latent and active can be toggled by Tat levels alone (i.e., 
independent of cellular-activation state), (2) that Tat is more 
effective at activation from latency than cell-state modifiers, 
and (3) that cellular relaxation to resting does not silence Tat
positive-feedback circuitry.

A Minimal Synthetic Circuit Shows that Viral 
Reactivation from Latency Can Be Toggled 
Independent of Cellular Activation 

To test whether HIV gene-regulatory circuitry can control proviral
latency without changes in cellular-activation state, we devel- 
oped synthetic circuits in which viral expression could be 
toggled independent of cell state. The synthetic circuits are 
based upon a minimal model of the HIV latency circuit and 
encode a transcriptional positive-feedback loop in which HIV 
Tat amplifies expression from the HIV LTR promoter (Jordan 
et al., 2001; Weinberger et al., 2005). The minimal LTR-Tat circuit
is sufficient to recapitulate latent gene expression; stimulation 
with cell-state modifiers reactivates proviral expression from a 
non-expressive “OFF” state to a high-level “ON” state. 

The minimalist synthetic toggle circuit encodes Tat fused to a 
controllable proteolysis tag, FKBP (Banaszynski et al., 2006), un- 
der the control of the HIV LTR (Figure 4A). FKBP degradation is 
reversibly inhibited by a small molecule, Shield-1, allowing Tat 
half-life to be rapidly tuned. The Tat-FKBP fusion was also 
tagged with a photo-switchable fluorescent protein, Dendra-2 
(Gurskaya et al., 2006), which allows for light-based pulse-chase
experiments (Zhang et al., 2007) to measure Tat half-life destabilization
in single cells (Figure S3). In this minimal LTR-Tat-Dendra-FKBP
viral vector, Tat half-life is reduced to 2.5 hr in
the absence of Shield-1 (a ~3.3-fold reduction from its native
half-life) but returns to its native 8 hr half-life (Weinberger and
Shenk, 2007) in the presence of 1 μM Shield-1.
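
The relationship between the two half-lives quoted above and the corresponding first-order decay rates can be sketched as follows. This is an illustrative calculation only: it assumes simple exponential decay of the Tat fusion protein, and the function and constant names are hypothetical rather than taken from the original analysis.

```python
import math

def decay_rate(half_life_hr):
    """First-order decay rate (per hour), assuming exponential decay."""
    return math.log(2) / half_life_hr

def fraction_remaining(half_life_hr, elapsed_hr):
    """Fraction of protein left after elapsed_hr, assuming no new synthesis."""
    return 0.5 ** (elapsed_hr / half_life_hr)

# Half-lives quoted in the text (hours).
NATIVE_HALF_LIFE_HR = 8.0         # with Shield-1
DESTABILIZED_HALF_LIFE_HR = 2.5   # without Shield-1

# Ratio of the two half-lives, i.e., the fold reduction in half-life.
fold_reduction = NATIVE_HALF_LIFE_HR / DESTABILIZED_HALF_LIFE_HR

if __name__ == "__main__":
    print(f"fold reduction in half-life: {fold_reduction:.2f}")
    print(f"decay rate with Shield-1:    {decay_rate(NATIVE_HALF_LIFE_HR):.3f} per hr")
    print(f"decay rate without Shield-1: {decay_rate(DESTABILIZED_HALF_LIFE_HR):.3f} per hr")
```

The ratio computed from the rounded half-lives is about 3.2, in line with the approximately 3.3-fold reduction stated in the text.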

Simulations predict that changes in Tat half-life should be sufficient
to toggle HIV positive feedback between ON and OFF at a
majority of viral integration sites (Figure S2). As predicted, 
altering the Tat half-life by addition or removal of Shield-1 was 
sufficient to toggle between latent and active expression across 
an array of integration sites (Figure 4B). The observed reactivation
is not due to pleiotropic effects of Shield-1 since Tat-Dendra
fusion proteins lacking FKBP are insensitive to Shield-1 (Fig- 
ure S3). Moreover, the increased expression levels cannot 
simply be due to an increase in the half-life of the reporter (Den- 
dra-2), as the expression increases are substantially greater than 
the 3.3-fold increase in half-life caused by Shield-1 (Figure S3). 
To be completely sure that reporter half-life changes were not ac- 
counting for the increased expression, we also decoupled the 
fluorescent reporter half-life from the Tat half-life by creating a 
polycistronic system in which the reporter protein and Tat are 
transcriptionally fused, but not translationally fused (Figure S4). 



The polycistronic system corroborates the finding that Tat positive
feedback is sufficient to control viral switching from an inexpressive
OFF to expressive ON state (Figure S4). Thus, in both the
translational and transcriptional fusions, Shield-1 toggles the cir- 
cuit between ON and OFF. These data indicate that tuning Tat 
positive feedback is sufficient to toggle HIV gene expression be- 
tween a quiescent state and an actively expressing state and that 
viral expression can be activated without activating cell state. 

Tat Induction Alone Is More Efficient Than Cell-State 
Activation for Reactivating Latent Clones 

One caveat of using tunable proteolysis systems to toggle the 
Tat circuit is that a minimal level of Tat protein must be present 
in the off state— i.e., modulating protein half-life when protein 
concentration is zero has no effect. Thus, the Tat-FKBP 
approach is unable to test whether Tat can reactivate latent cells 
that are fully silenced. To circumvent this obstacle and test 
whether Tat induction is sufficient to reactivate completely 
silenced LTRs, we developed a set of open-loop circuits, based 
on the Tet-On system (Gossen and Bujard, 1992), that induce Tat
expression de novo. These systems allow tight induction of Tat 
expression upon Doxycycline (Dox) addition. To examine the ef- 
fects of Tat induction on HIV gene expression, these circuits 
were incorporated into cells that encoded an HIV LTR promoter 
driving the mCherry fluorescent reporter (Figure 4C), and a library 
containing 33 distinct LTR clonal integration sites was examined. 

The Tet-On circuits show that Tat by itself is sufficient to toggle
cells between OFF and ON and to control the mean levels of LTR 
expression despite the large clonal variation (Figure 4D). Impor- 
tantly, a number of clones (clones 1-3) exhibit no detectable LTR 
expression in the absence of Tat induction— the conventional 
threshold for latency. But inducing Tat expression is sufficient
to fully reactivate these clones without the need for any cell-state 
activation signals. 

Next, to test the effects of cell-state activation, Tet-inducible 
isoclonal populations were exposed to an array of standard 
cell-state modifiers. These agents are potent activators of T lym- 
phocytes (Pazin et al., 1996) and also of the LTR (Jordan et al., 
2001; Karn, 2011). For example, TNFα strongly activates T cell
state by stimulating nuclear localization of the nuclear factor of 
activated T cells (NFAT) and by stimulating recruitment of the 
p50-RelA heterodimer to promoters containing NF-κB-binding
sites (Karin and Lin, 2002). If cell-state activation were the domi- 
nant factor controlling latency, then cell-state modulators should 
strongly reactivate latent mCherry expression in the Tet-induc- 
ible system. Strikingly, cell-state activation alone only slightly in- 
creases LTR expression and the percentage of cells in the ON 
state, across the library of 33 distinct integration sites (Figure 4E). 
In contrast, induction of Tat (by Dox) drastically increases the 
percentage of cells in the ON state to near 100% (Figure 4E). 
This dramatic difference between direct Tat induction versus 
cell-state modifiers demonstrates that ktransact > kon for the HIV 
circuitry and indicates that Tat-mediated transactivation is far 



(D) Fold change in percentage of trajectories in ON state for 2-fold reductions in kon. Circuit activity (%ON) is largely robust to reductions in LTR activity (i.e., kon)
over three orders of magnitude. Phase-plane analysis (i.e., sensitivity analysis) from a closed-form analytical solution shows that this behavior is robust across the
physiological parameter regime (ktransact > kon).
See also Figure S2 and Table S1.
stronger an effect than the switching of the LTR to an ON state
through cell-state modifications. Collectively, these data (Fig- 
ure 4E) indicate that activating cell-state alone is not sufficient 
to control HIV transcription. These results in no way exclude a 
role for cellular state in HIV reactivation in vivo. Rather, the suffi- 
ciency of Tat-mediated viral reactivation without cell-state modi- 
fication emphasizes the autonomy of the HIV Tat circuit. 

Refactoring of Full-Length Replicating HIV Indicates 
that Latency Establishment and Reactivation Depend on 
Viral-Circuit Activity and Are Largely Independent of 
Cellular Activation 

We next tested whether viral circuitry could control latency in full- 
length replicating virus. First, we developed a decoupled system in 



Figure 4. Synthetic Tuning of Tat Circuit Activity Is Sufficient to Control Latent HIV Expression in the Absence of Cellular Activation

(A) Schematic of the minimal LTR-Tat-Dendra-FKBP
lentiviral circuit. In the absence of Shield-1,
the Tat-Dendra-FKBP fusion protein is rapidly 
degraded, diminishing positive feedback. When 
Shield-1 is added, FKBP-mediated proteolysis is 
blocked, allowing Tat levels to increase and 
enabling strong Tat positive feedback. 

(B) Flow cytometry histograms of eight isoclonal 
populations of Jurkat cells infected with LTR-Tat- 
Dendra-FKBP in the absence of Shield-1 (light 
gray histograms) or the presence of 1 μM Shield-1
(dark gray histograms). Gating of the Dendra- 
positive region (right of black-dashed line) was set 
relative to naive, un-transduced Jurkat cells. See 
also Figures S3 and S4. 

(C) Schematic of the synthetic system (left) and 
flow cytometry data of the LTR expression in cells 
transduced with the synthetic circuit (right). The 
synthetic circuit is composed of an rTta activator 
constitutively expressed from an SFFV promoter. 
In the presence of Dox, rTta protein activates 
the Tet-On promoter to drive expression of the 
Tat-Dendra fusion protein. Tat transactivates 
expression from the HIV-1 LTR promoter, and LTR 
activity is measured by mCherry expression. 

(D) LTR mCherry expression is shown for 11
representative isoclonal populations in the 
absence of Dox (light gray histograms) or after 
Dox addition (dark gray histograms). 

(E) Flow cytometry analysis of a library contain- 
ing 33 distinct LTR clonal integration sites sub- 
jected to Dox and a panel of standard cell-state
modifiers: TNFα, phorbol myristate acetate
(PMA), PMA-ionomycin, suberanilohydroxamic 
acid (SAHA/vorinostat), trichostatin A (TSA), or 
prostratin. 

Error bars show SD. 



which Tat expression is controlled by the cells (via Tet-On) completely independently
of the virus. The engineered cells, termed “inducible Tat cells,” contain a stably
integrated Tet-inducible Tat-Dendra cassette and provide in trans complementation of Tat for a reengineered
Tat-deleted full-length virus, the ΔTat-Cherry virus. The
ΔTat-Cherry virus was constructed from a full-length HIV molecular
clone containing a Tat deletion (Huang et al., 1994) and encodes
an mCherry fluorescent reporter within nef (Figure 5A). In these
inducible Tat cells, viral gene expression can be toggled on even
if initial Tat levels are zero. Virus replicates only in the presence
of Dox and, as with conventional strains, is inhibited by HIV
protease inhibitors (Figure S5). Inducing Tat expression in these
cells during infection with ΔTat-Cherry virus yields a ~400% increase
in active infection compared to non-induced ΔTat-Cherry-infected
cells (Figure 5B), indicating that absence of Dox
drives the virus to enter latency, in agreement with findings that
Tat protein can inhibit establishment of latency (Donahue et al.,
Figure 5. Tat Feedback Circuitry Is Sufficient to Control Active-versus-Latent Infection in Full-Length Viruses

(A) Schematic of experiment: A Jurkat cell line in 
which Tat-Dendra is expressed only in the pres- 
ence of Dox, “inducible Tat cells,” was infected 
with full-length ΔTat-Cherry virus in the presence
(+) or absence (−) of Dox to score for latency and
to score reactivation. Dox− infections were subsequently
induced by Dox.

(B) Percent of cells actively infected (actively ex- 
pressing mCherry) 2 days post infection. 30% of 
cells were actively infected in the presence of Dox 
(blue), while only 7% of cells were actively infected 
in the absence of Dox (red). Upon subsequent Dox 
incubation of the Dox- infection, 28% of cells 
reactivated to active infection (purple), indicating 
that virtually all latent cells can be reactivated with 
Tat induction. 

(C) Experiment schematic: CEM T cells were 
infected with either full-length Tat-FKBP virus 
or control virus in the presence or absence of 
Shield-1. 

(D) Percent of cells actively infected (actively 
expressing Dendra) 2 days post infection. For 
the control virus infection, 25.8% ± 1.0% of cells
exhibit active infection in the presence of 1 μM
Shield-1 (blue), while 26.0% ± 2.7% exhibit
active infection in the absence of Shield-1 (red).
For the Tat-FKBP virus infection, 17.5% ± 1.7%
of cells exhibit active infection in the presence
of 1 μM Shield-1 (blue), while 7.5% ± 1.0% of
cells exhibit active infection in the absence 
of Shield-1 (red). Infections were performed in 
triplicate. Error bars = 1 SD. Control virus 
infection and Tat-FKBP virus infection are inde- 
pendent experiments (infection titers of the two 
are different). 

(E) Comparison of viral circuit versus cell-state 
activation by quantifying the percentage of delta- 
Tat virus infections that enter the active state. 
In the absence of TNFα or Dox, 2% of cells
generate active HIV replication. Dox addition 
increases active infections to ~13%, while 

TNFα generates 4% actively infected cells. The same can be seen by plotting Tat expression level (Dendra). Again, TNFα by itself leaves expression
level unchanged over that in absence of treatment. Addition of Dox leads to >2-fold increase in expression. 

Also see Figure S5 for the experiment repeated with Dox and a panel of cell-state modifiers. 




2012). Strikingly, subsequent induction of Tat expression by Dox 
fully reactivates latent virus to levels observed in the initial infection 
with Dox (Figure 5B). Further, Dox was far more effective in reacti- 
vating latent virus than any of the standard cell-state modifiers: 
TNFα, PMA, PMA-ionomycin, SAHA/vorinostat, TSA, or prostratin
(Figure S5). Hence, latent provirus can be reactivated by Tat induction
alone, without altering cellular-activation state, demonstrating
that Tat is sufficient to control latent reactivation in full-length HIV.

Next, to check whether Tat induction in cis (i.e., within the
positive-feedback loop) could also control latency in full-length
virus, we reengineered the ΔTat-Cherry virus to encode either
the Tat-Dendra-FKBP cassette, referred to as “Tat-FKBP virus” 
(Figure 5C), or a control Tat-Dendra cassette, referred to as “Tat- 
Dendra control virus,” or simply “control virus” (Figure S5). As 
previously established in these nef-reporter viruses, actively
replicating infections express reporter, while latent infections 



are quantified by absence of reporter expression (Jordan et al., 
2003; Pearson et al., 2008). In control HIV infections, Shield-1
has no measurable effect on active-versus-latent infection (Figure 5D).
In striking contrast, in Tat-FKBP virus infections, modulating
Tat positive-feedback strength with Shield-1 alters the
percentage of actively infected cells by 141%, i.e., >2-fold (Figure 5D).
The reduction in actively infected cells is not due to
reduced input virus, since equivalent titers of virus (i.e., MOIs)
were used in the presence and absence of Shield-1. Moreover, the
lack of measurable difference in infection in control HIV infections
indicates that Shield-1 is not inducing abortive infections
and that hypothetical pleiotropic effects of Shield-1 cannot
explain the difference in active-versus-latent infection. Overall,
these results show that modulating viral feedback strength is suf- 
ficient to control the establishment of active-versus-latent infec- 
tion in full-length replicating virus. 
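
As a rough check on the size of the Shield-1 effect, the fold change in active infection can be computed from the rounded mean percentages reported in the Figure 5D legend. This is an illustrative sketch only; the helper name is hypothetical, and the inputs are the rounded legend values rather than the underlying replicate data.

```python
def fold_change(percent_with_shield, percent_without_shield):
    """Ratio of percent actively infected cells with versus without Shield-1."""
    return percent_with_shield / percent_without_shield

# Rounded means from the Figure 5D legend (percent of cells actively infected).
tat_fkbp_fold = fold_change(17.5, 7.5)   # Tat-FKBP virus
control_fold = fold_change(25.8, 26.0)   # Tat-Dendra control virus

if __name__ == "__main__":
    print(f"Tat-FKBP virus fold change: {tat_fkbp_fold:.2f}")
    print(f"Control virus fold change:  {control_fold:.2f}")
```

With these inputs the Tat-FKBP virus shows a greater than 2-fold difference, while the control virus ratio stays close to 1, matching the qualitative statement in the text.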
Figure 6. Tat Feedback Circuitry Is Sufficient to Autonomously Regulate Viral Expression during the Activated-to-Resting Transition in Primary T Cells

(A) Experiment schematic: Donor-derived primary 
CD4+ T lymphocytes were activated and infected
with LTR-Tat-Dendra-FKBP in either the presence 
of Shield-1 (blue, wild-type feedback) or without 
Shield-1 (red, attenuated feedback), and cells were
allowed to relax back to resting (as measured 
by CD25 surface expression) in the presence/ 
absence of Shield-1 (i.e., under wild-type/attenu- 
ated feedback). 

(B) Flow cytometry analysis of viral expression 
(Dendra fluorescence) in primary CD4+ T lymphocytes
during transition from activated to resting in
absence of Shield-1 (attenuated feedback; top) or
presence of Shield-1 (wild-type feedback; bottom);
activated lymphocytes are shown as opaque
histograms, and resting lymphocytes are shown
as translucent histograms.

(C) Plot of the fold change in the number of active 
infections for varying cellular state (fold change cell 
activation as measured by CD25 surface expres- 
sion; see also Figure S6). If feedback strength is 
wild-type (blue data points; blue trend line), the fold 
change in viral activity is uncorrelated with 
changing cell state. In the presence of attenuated 
feedback, the percentage of active infections is 
dependent on cell state. Each data point is 
normalized against the percent of active infections 
in the lowest cell-state activation data point. 



Tat Induction Is >300% More Effective Than Cellular 
Activation for Reactivating Full-Length Latent HIV 

To directly compare the effects of tuning viral circuitry to altering 
cellular-activation state, inducible Tet-Tat-Dendra cells were infected
with ΔTat virus in the presence of Dox or TNFα (Figure 5E).
Modifying cellular activity with TNFα, in the absence of Tat induction,
leads to a 1.5-fold change in the percentage of active infections
(from 2% to 4% active infection), whereas Tat induction
drastically increases, by >300%, the proportion of infections
that are active (Figure 5E). Similar results were seen in reactivating
latent cells post infection (Figure S5): inducible Tet-Tat-Dendra
cells were infected with ΔTat virus and, 3 days post infection,
were treated with either Dox or standard cell-state modulators
(as well as combinations of the two). Tat induction through Dox
was significantly more effective at reactivation than the cell-state 
modifiers. Thus, as seen with the minimal-synthetic circuits (Fig- 
ure 4), perturbing viral circuitry provides substantially more 
potent reactivation of latency than targeting cell state alone. 

Tat Circuitry Is Sufficient to Autonomously Regulate 
Viral Expression during the Activated-to-Resting 
Transition in Human Primary T Lymphocytes 

As a final test, we directly examined the model prediction that 
Tat circuitry alone is sufficient to explain the resilience of HIV 
transcription to cellular silencing during cellular relaxation from 
activated to resting (Figure 3D). Activated primary CD4+ T cells
were transduced with LTR-Tat-Dendra-FKBP virus and allowed 
to relax from an active to a resting-memory state while Tat pos- 



itive-feedback strength was either maintained or attenuated by 
removing Shield-1 (Figure 6A). 

When Tat positive feedback is attenuated (by absence of 
Shield-1) as lymphocytes relax from activated to memory, signif- 
icant silencing of HIV gene expression occurs (Figure 6B, red his- 
tograms). However, when Tat positive-feedback strength is 
maintained at wild-type levels (via Shield-1 addition), only a slight 
shift in HIV gene expression occurs as lymphocytes transition 
from active to memory (Figure 6B, blue histograms). Quantifying 
the relaxation of cellular activation alongside viral latency reveals 
a remarkable relationship: if Tat feedback is attenuated, the 
cellular-activation state tightly controls entry to latency by signif- 
icantly reducing the percentage of cells in active infection (Fig- 
ure 6C, red); however, when Tat feedback is active (the case in 
Figure 2), the cellular activation state has no bearing on entrance 
into latency as the percentage of cells in active infection remains 
constant (Figure 6C, blue)— i.e., the intact feedback circuit al- 
lows viral gene expression to act completely independent of 
cellular-activation state. Thus, active Tat feedback appears to 
buffer HIV from global transcriptional silencing as primary lym- 
phocytes transition from active to resting memory. 

DISCUSSION 

Beginning with observations that HIV gene expression is 
largely autonomous to cellular relaxation (Figure 2), computa- 
tionally guided synthetic reconstruction revealed Tat positive 
feedback as the core mechanism underlying viral autonomy 
(Figures 3-5). Strikingly, Tat feedback alone is sufficient to
overcome cell-driven silencing of HIV transcription during 
cellular relaxation from active to resting in primary T cells (Fig- 
ure 6). These findings are consistent with patient-cell latent-re- 
activation experiments showing that direct addition of Tat 
activates viral expression and reverses latency in resting 
CD4+ T cells without requiring cellular activation (Lassen
et al., 2006; Lin et al., 2003). Thus, in patient cells, Tat-medi- 
ated positive feedback also appears to regulate latency inde- 
pendent of cell state. 

The data herein cannot discount one variant of the cell-state 
hypothesis which proposes that latency is established when 
HIV infects relaxing cells which are at an activation level just 
above a first threshold required for HIV infection and integration 
but below a second threshold required to sustain active Tat 
expression and viral replication. However, there are difficulties 
with this hypothesis. While the presence of two thresholds is 
plausible, the second (Tat activation) threshold being higher 
than the first (infection) threshold is not consistent with existing 
data. For example, although global activation of primary CD4+
T cells is required for efficient infection, HIV can be reactivated 
from latency in primary cells without globally activating the cells 
(Xing et al., 2012). Similarly, the reactivation of HIV in resting 
T cells using Tat protein (Lassen et al., 2006; Lin et al., 2003) in- 
dicates that extremely low levels of cellular activation (i.e., in 
resting/quiescent cells) are still amenable to robust viral ex- 
pression. Thus, since resting cells are at an activation level 
non-permissive to infection (Pan et al., 2013) but are sufficiently 
activated for Tat to function, the putative Tat-activation threshold 
is lower than the infection threshold and the two-threshold sce- 
nario appears unlikely. 

If cellular relaxation does not lead to the establishment of HIV 
latency, how is HIV latency established? Previous studies 
demonstrated the intrinsic ability of the Tat positive-feedback 
circuit to rapidly and stochastically establish latency (Wein- 
berger et al., 2005), consistent with recent primate studies 
showing that latency is rapidly established within the first 
3 days of infection (Whitney et al., 2014) and with cell-culture 
models showing latency establishment immediately upon infec- 
tion (Calvanese et al., 2013; Dahabieh et al., 2013). Given that 
resting CD4+ T lymphocytes are highly resistant to direct HIV
infection (Pan et al., 2013), the rapid establishment of latency 
is difficult to reconcile with the cell-state epiphenomenon the- 
ory; productively infected cells live <2 days in vivo (Perelson 
et al., 1997), while the process of T cell transitioning from active 
to memory is a slow and low-probability process (Youngblood 
et al., 2013) occurring during and after vigorous expansion of 
effector lymphocytes that only begins weeks after infection 
(Kuroda et al., 1999). The alternate model examined here (Fig- 
ure 3), wherein intrinsic (stochastic) viral circuitry autonomously 
regulates HIV latency, also provides a mechanistic basis for 
recent observations in patient cells (Ho et al., 2013), showing 
that: (1) a significant fraction of latent proviruses are not induced 
even if cells are reactivated from a resting-memory state, and 
(2) a second identical cellular stimulation (of already activated 
cells) induces additional latent proviruses to reactivate. These 
results indicate that viral reactivation is probabilistic. While 
particularly puzzling for the cellular-control hypothesis, probabi- 



listic reactivation is consistent with HIV latency being regulated 
by an autonomous viral-encoded circuit influenced by stochas- 
tic gene-expression fluctuations, which provides rationale for 
targeting viral gene-expression circuitry to reactivate latent 
HIV (Dar et al., 2014). 

To be completely clear, the viral-encoded latency model does 
not exclude a role for cellular state in regulating HIV proviral la- 
tency. In fact, the Tat-feedback model predicts that latency 
establishment is sharply reduced at higher cellular activation 
levels (Figure 3C) and that cellular activation probabilistically 
reactivates latent virus (Equation 12 in Extended Experimental 
Procedures). Experimentally, cellular activation clearly rescues 
attenuated feedback (Figure 6B). Similarly, the ability of Tat 
expression to reactivate latent virus independent of cellular 
activation (Figures 4 and 5) does not imply that in vivo latent re- 
activation occurs absent cellular activation. Rather, the results 
herein demonstrate— contrary to prevailing dogma— that there 
is also an underlying viral program that autonomously regulates 
proviral latency. 

A viral-encoded latency program naturally raises questions 
on the evolutionary origin and function of HIV latency. While 
sensor-actuator circuitry would have been consistent with either 
the epiphenomenon hypothesis or evolutionary hardwiring, an 
autonomous regulatory circuit is invariably hardwired and 
must be selectively maintained— especially in a rapidly evolving 
virus under strong selection. So, how would latency be benefi- 
cial in the natural history of lentiviral infection? In a companion 
paper (Rouzine et al., 2015 [this issue of Cell]), we propose 
that latency may provide a fitness advantage by acting as a viral 
“bet-hedging” strategy to enhance net viral transmission prob- 
ability. An associated aspect is the decision-making architec- 
ture behind latency: Tat positive feedback maintains strong 
expression levels robust to cellular perturbations, while large 
stochastic fluctuations exhibited by the LTR promoter enable 
the system to probabilistically switch (Dar et al., 2012). Notably, 
this architecture has been theoretically proposed to be an unre- 
liable environmental sensor in fluctuating environments (Brand- 
man et al., 2005), suggesting that HIV’s circuit architecture is 
precisely the opposite configuration that would be required for 
a reliable environmental sensor— a reliable sensor would 
respond faithfully to environmental changes— and similar sto- 
chastic positive-feedback circuitry has been proposed for 
autonomous decision making in other biological systems (Jil- 
kine et al., 2011). Overall, viral evolution appears to have 
selected for circuitry that both maintains remarkable autonomy 
from environmental cues and simultaneously drives probabi- 
listic on-off decision making. 

EXPERIMENTAL PROCEDURES 

Primary-Cell Isolation and Cell-Culture Conditions 

Primary CD4+ T cells were isolated from peripheral blood obtained from
Stanford Blood Bank (Palo Alto, CA) using RosetteSep Human CD4+
T Cell Enrichment Cocktail from STEMCELL Technologies and Ficoll as 
described (Terry et al., 2009). Once isolated, cells were either cultured as 
described (Terry et al., 2009) or frozen in 10% DMSO, 90% culture media 
at a density of 10^ per ml. For infections, primary CD4+ T cells were
pre-activated for 2-3 days with αCD3/CD28 beads (Dynabeads, Life Technologies)
as per manufacturer’s instructions. Cell activation was measured
by flow cytometry with anti-CD25-PE-conjugated antibody and anti-CD69-APC-conjugated
antibody from BD Biosciences. Primary CD4+ T lymphocytes,
Jurkat T lymphocytes, and CEMs were all cultured in RPMI 1640
(supplemented with L-glutamine, 10% fetal bovine serum, and 1% penicillin-streptomycin)
in a humidified environment at 37°C and 5% CO2.
Jurkats and CEM were maintained by passage between 2x10® and 2 x 
10® cells/ml. Primary cell media was supplemented with 20 U/ml r-IL2 
(Peprotech, 200-02). 

Computational Modeling 

A simplified two-state model of Tat positive feedback was constructed from 
experimental data of LTR toggling (Dar et al., 2014; Dar et al., 2012; Singh 
et al., 2010; Singh et al., 2012) and simulated using the Gillespie algorithm 
(Gillespie, 1977) to test how altering LTR basal transcription rate or Tat pro- 
tein stability would affect the activity of the circuit. The chemical reaction 
scheme and parameters used are described in Table S1. The outputs from
simulations are the different molecular species in arbitrary numbers. Sto- 
chastic simulations were run in Mathematica using the xSSA package 
(http://www.wolfram.com/mathematica/ and http://www.xlr8r.info/SSA/). 
Initial conditions for all species were set to 0, except for LTR_ON, which
was set to 1, and simulations were run to time = 200 (arbitrary time units); 
500 simulation runs were conducted for each parameter set. See Extended 
Experimental Procedures for further details and explanation of simulations 
for the more complex model (Figure S2). 
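
The simulation approach described above can be illustrated with a minimal Python sketch of the Gillespie algorithm applied to a toy two-state promoter with Tat positive feedback. This is only a sketch under stated assumptions: the reaction scheme, the rate constants, and every name in it (`simulate_tat_circuit`, `k_on`, `k_off`, `k_basal`, `k_fb`, `half_sat`, `d_tat`) are hypothetical placeholders rather than the scheme or values of Table S1, and Python stands in for the Mathematica xSSA package used for the actual simulations. Only the initial conditions (LTR_ON = 1, all other species 0) and the end time of 200 arbitrary units follow the text.

```python
import random


def simulate_tat_circuit(k_on=0.1, k_off=0.5, k_basal=1.0, k_fb=5.0,
                         half_sat=10.0, d_tat=0.2, t_end=200.0, seed=0):
    """Gillespie simulation of a toy two-state LTR circuit with Tat feedback.

    All rate constants are illustrative assumptions, not values from Table S1.
    Returns a list of (time, ltr_on, tat) tuples, one per reaction event.
    """
    rng = random.Random(seed)
    t, ltr_on, tat = 0.0, 1, 0  # LTR starts ON, no Tat present
    trajectory = [(t, ltr_on, tat)]
    while True:
        propensities = [
            k_on * (1 - ltr_on),                     # 0: LTR_OFF -> LTR_ON
            k_off * ltr_on,                          # 1: LTR_ON -> LTR_OFF
            k_basal * ltr_on,                        # 2: basal Tat production
            k_fb * ltr_on * tat / (half_sat + tat),  # 3: Tat-amplified production
            d_tat * tat,                             # 4: Tat decay
        ]
        total = sum(propensities)
        if total <= 0.0:
            break  # no reaction can fire
        t += rng.expovariate(total)  # exponentially distributed waiting time
        if t > t_end:
            break
        # Choose a reaction with probability proportional to its propensity.
        r = rng.random() * total
        chosen = None
        for index, a in enumerate(propensities):
            if a <= 0.0:
                continue
            chosen = index
            r -= a
            if r < 0.0:
                break
        if chosen == 0:
            ltr_on = 1
        elif chosen == 1:
            ltr_on = 0
        elif chosen in (2, 3):
            tat += 1
        else:
            tat -= 1
        trajectory.append((t, ltr_on, tat))
    return trajectory
```

Calling `simulate_tat_circuit(seed=s)` for 500 different seeds per parameter set would mirror the number of runs described above; varying `k_basal` or `d_tat` corresponds to altering the basal transcription rate or Tat stability in this toy scheme.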

Recombinant Virus Production and Infections 

Lentivirus was packaged in 293T cells and isolated as described (Dull 
et al., 1998; Weinberger et al., 2005). HIV-d2GFP (Jordan et al., 2003) 
was packaged with dual-tropic env-encoding plasmid pSVIII-92HT593.1
(NIH AIDS Reagents Program). Before infecting primary cells, activation 
beads were removed and cells were mixed with an appropriate amount of
virus (to get <10% infection) in 100 µl media and spinoculated at 32°C
for 2 hr at 1,000 x g. 

To generate the isoclonal populations with engineered viral circuits, lenti- 
virus was added to Jurkat T Lymphocytes at a low MOI to ensure a single 
integrated copy of proviral DNA in infected cells. Cells were stimulated 
with tumor necrosis factor α (TNFα) and Shield-1 for 18 hr before sorting
for Dendra-expressing cells. Isoclonal and polyclonal populations were 
created as described (Weinberger et al., 2005). Sorting and analysis of cells 
infected was performed on a FACSAria II. The same procedure was followed 
to create the LTR-Tat-Dendra and LTR-mCherry-IRES-Tat-FKBP cell lines. 
Inducible Tat cells were generated by transducing Jurkat cells with Tet- 
Tat-Dendra and SFFV-rTta lentivirus at high MOI. The cells were incubated 
in Dox for 24 hr and then FACS sorted for Dendra+ cells to create a polyclonal
population. To create the Tet-Tat-Dendra + LTR-mCherry cells, the
polyclonal population was infected with LTR-mCherry lentivirus at a low
MOI. Before sorting for mCherry+ and Dendra+ cells, Dox was added at
500 ng/ml for 24 hr, and single cells were FACS sorted and expanded to 
isolate isoclonal populations. The same procedure was followed for the 
Tet-Tat-Dendra-FKBP + LTR-mCherry populations; however, 24 hr before 
the sort, 1 µM Shield-1 and 500 ng/ml Dox were added to the culture. All
inducible Tat or control HIV infection experiments were performed by incu- 
bating 5x10® CEM cells in the same titer of inducible Tat or the same titer 
of control HIV in the presence or absence of Shield-1 and taking a flow cytometry
time point after 48 hr. Δ-Tat mCherry infections were carried out
using 10®-10® inducible Tat (Jurkat) cells in the presence or absence of 
500 ng/ml doxycycline. 

Flow Cytometry and Analysis 

Flow cytometry data were collected on a BD FACSCalibur DxP8, BD LSR II, 
or HTFC Intellicyt for stably transduced lines and primary cells and on a 
BD FACSAria II for replication-competent virus assays and sorting. All 
flow cytometry experiments on replication-competent virus were per- 
formed in BSL3 conditions (safety information available upon request). 
Flow cytometry data were analyzed in FlowJo and using customized 
MATLAB code. 
SUMMARY

HIV latency is the chief obstacle to eradicating HIV but 
is widely believed to be an evolutionary accident 
providing no lentiviral fitness advantage. However, 
findings of latency being “hardwired” into HIV’s 
gene-regulatory circuitry appear inconsistent with la- 
tency being an evolutionary accident, given HIV’s 
rapid mutation rate. Here, we propose that latency is 
an evolutionary “bet-hedging” strategy whose fre- 
quency has been optimized to maximize lentiviral 
transmission by reducing viral extinction during 
mucosal infections. The model quantitatively fits the 
available patient data, matches observations of 
high-frequency latency establishment in cell culture 
and primates, and generates two counterintuitive 
but testable predictions. The first prediction is that 
conventional CD8-depletion experiments in SIV-infected
macaques increase latent cells more than
viremia. The second prediction is that strains engi- 
neered to have higher replicative fitness — via reduced 
latency — will exhibit lower infectivity in animal-model 
mucosal inoculations. Therapeutically, the theory pre- 
dicts treatment approaches that may substantially 
enhance “activate-and-kill” HIV-cure strategies. 

INTRODUCTION 

HIV actively replicates in CD4+ T lymphocytes but can also enter
a long-lived quiescent state termed proviral latency in memory
CD4+ T cells (Chun et al., 1997a; Finzi et al., 1997). The population
of latently infected cells is relatively small in patients (~1 in
10^6 CD4+ T cells) and does not generate significant viral RNA
(Pierson et al., 2000). However, latently infected cells provide a 
critical viral reservoir, which enables lentiviral persistence even 
during prolonged antiretroviral therapy (ART). Further, if patients 
interrupt ART, persisting latent viruses reactivate, driving HIV to 
pre-treatment viral loads within weeks (Richman et al., 2009). 
Consequently, latency is the chief barrier to a curative HIV 
therapy. 
While latency enables HIV to avoid extinction during ART, the
benefit of latency prior to the ART era— during the centuries of nat- 
ural lentiviral infections— remains unclear. In fact, latency appears 
to have been deleterious prior to ART since latently infected cells 
produce no virus and decrease patient viral loads. Given latency’s 
reduction of lentiviral replicative fitness, the prevailing hypothesis 
is that latency is an evolutionary accident— an epiphenomenon 
that only results when lentiviruses infect CD4+ T cells that are transitioning
from activated to quiescent memory states (Coffin and
Swanstrom, 2013; Eisele and Siliciano, 2012; Han et al., 2007). Latency
is therefore viewed to be an infrequent bystander effect that
only occurs after a viral-driven adaptive immune response initiates
and CD4+ T lymphocytes begin to form memory subsets. Yet, a
recent study in Rhesus macaques indicates that latency reaches 
high levels within the first 3 days of infection (Whitney et al., 
2014), which is prior to the generation of an SIV-specific adaptive 
immune response (Kuroda et al., 1999). 

If latency were a non-beneficial viral trait or epiphenomenon, 
one would expect it to have been lost due to natural selection 
or genetic drift, given lentiviruses’ rapid evolutionary rates. Yet, 
a companion study (Razooky et al., 2015 [this issue of Cell]) 
demonstrates that the ability to establish latency is “hardwired” 
into HIV’s gene-regulatory circuitry. This study matches recent 
data showing that ~50% of cell-culture infections— in which 
adaptive immune responses are absent— result in lentiviral la- 
tency (Calvanese et al., 2013; Dahabieh et al., 2013). Further, 
HIV’s auto-regulatory Tat circuit appears optimized to amplify 
stochastic fluctuations in viral gene expression, producing fluc- 
tuations that are sufficient to induce a probabilistic switch to la- 
tency (Burnett et al., 2009; Weinberger et al., 2005; Weinberger 
et al., 2008). In general, stochastic expression noise is thought 
to be selected against and thus filtered out of regulatory circuits 
when not beneficial (Batada and Hurst, 2007; Fraser et al., 
2004). The persistence of a hardwired latency circuit suggests 
an unknown selective advantage, which outweighs latency’s 
putative fitness cost of reducing long-term viral loads. 

One possible selective benefit is that— by providing a long- 
lived viral reservoir— latency could enhance lentiviral survival 
during unfavorable environmental conditions. Similar “bet-hedging”
hypotheses (Cohen, 1966) have been proposed for bacteriophage-λ
lysogeny (Arkin et al., 1998) and bacterial persistence
(Balaban, 2011). However, lentiviral latency would only provide a
bet-hedging advantage if there were risks of viral extinction due 
to environmental fluctuations. In reality, lentiviruses appear in little
danger of population crashes, as they evade immune clearance
and maintain high viral loads of particles/ml of blood
plasma for years (and lentiviruses clearly did not evolve under 
pressure from antiretroviral drugs). Further, lentiviruses only 
infect a small percentage (~1%-2%) of available target cells, 
making target-cell fluctuations unimportant during chronic infec- 
tion. Nevertheless, viral loads remain low during one phase of the 
lentiviral lifecycle: initial mucosal infection. 

The probability of successful mucosal infection is low, with 
<1% of unprotected sex acts between HIV-discordant couples
resulting in self-propagating systemic HIV infections (Fraser 
et al., 2007; Gray et al., 2001; Wawer et al., 2005). When suc- 
cessful infections do occur, they expand from single founder se- 
quences (Kearney et al., 2009; Keele et al., 2008), indicating that 
only one variant in the transmitted quasispecies avoids extinc- 
tion. Further, animal models of HIV capture a consistent 
~6 day delay from experimental mucosal inoculation to self-
propagating infection (Haase, 2011; Zhang et al., 1999), which 
implies that the first days of lentiviral infection provide conditions 
unsuitable for viral growth. 

The unfavorable conditions of early lentiviral infections typically 
occur in the mucosa, where >90% of HIV infections initiate 
(Haase, 2011). HIV’s evolutionary precursor in non-human pri- 
mates (SIV) also spreads through mucosal transmission— via 
sexual activity or fighting with subsequent communal wound 
licking (Santiago et al., 2005). Mucosal challenge experiments 
in primates with large inoculations provide direct evidence that 
the mucosa are initially unfavorable to lentiviral growth: large 
inoculations of ~10® infectious units (by TCID50) initially burn 
out within ~5 days (Miller et al., 2005). Quantitatively, each initially
infected cell lives for ~1 day (Markowitz et al., 2003), so the number
of actively infected cells after 5 days scales with $(R_0^{muc})^5$,
wherein $R_0^{muc}$ is the basic reproductive ratio during early
mucosal infection. Since actively infected cells crash within
~5 days (Miller et al., 2005), $(R_0^{muc})^5$ approaches 0, implying
that $R_0^{muc} \ll 1$ during initial mucosal infection.
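
To make this scaling concrete, consider a worked example under assumed values: $R_0^{muc} = 0.25$ (the value used for early mucosal infection in Figure 2A) and an illustrative inoculum of $I_0 = 30$ initially infected cells. The number 30 and the symbol $I_{active}$ are assumptions introduced only for this example.

```latex
\[
I_{active}(5\ \text{days}) \sim I_0 \,\bigl(R_0^{muc}\bigr)^5
  = 30 \times 0.25^5 = \frac{30}{1024} \approx 0.03
\]
```

Under these assumptions, fewer than one actively infected cell is expected to remain after five ~1-day generations, which is the sense in which actively infected cells "burn out."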

Here, we quantitatively test the hypothesis that latency pro- 
vides a bet-hedging advantage that increases the probability of 
successful lentiviral transmission despite reducing viral loads 
during systemic infection (Figure 1A). The key point is that 
increasing the probability of latency ($p_{lat}$) increases the probability
that each initially infected cell survives initial mucosal infection.
Yet, increasing $p_{lat}$ also decreases viral loads in systemically
infected hosts, which reduces the inoculum transmitted to
new hosts. With a higher per-cell survival rate but fewer initially 
infected cells, the question is whether latency’s fitness benefits 
outweigh its costs— which would establish latency as an evolu- 
tionarily beneficial trait that is maintained by natural selection. 

RESULTS AND DISCUSSION 

Mathematical Models of Lentiviral Transmission 
and Rationale for Models 

Three classes of mathematical models are developed to quantify 
the net impact of latency on lentiviral transmission (Figure S1).
Each class of models generalizes the well-parameterized basic
model of viral dynamics (Nowak and May, 2000) to include
both proviral latency and the conditions of early mucosal infection
(i.e., $R_0^{muc} < 1$) during which latency may be critical (Experimental
Procedures).

The first class of models tracks initial lentiviral infection in the 
mucosa alone (Extended Experimental Procedures, Section A). 
Given the small numbers of infected cells during initial mucosal 
infection, the established model of mucosal infection is stochas- 
tic (Pearson et al., 2011). We analyze this experimentally param- 
eterized stochastic model— and a deterministic approximation 
to this model— to quantify how the probability of viral extinction 
in the mucosa depends on the probability of latency ($p_{lat}$).

The second class of models extends the single-compartment 
model into a two-compartment model (Figure 1B) that tracks
both initial infection in the mucosa and systemic infection in
the lymphoid tissue (Extended Experimental Procedures, Section B).
Importantly, the initial and systemic infection model compartments
only differ in a single experimentally measured parameter:
$R_0$ (Figure 1B and Table S1). Collectively, the models
predict an optimal value of $p_{lat}$ ($p_{lat}^{opt} \approx 0.5$) that matches latency
frequencies measured in cell culture (Calvanese et al., 2013; Dahabieh
et al., 2013) and is consistent with latency levels
measured in mucosal primate infections (Whitney et al., 2014).
However, the large value of $p_{lat}$ does not match the low frequencies
of latency observed in chronically infected patients
(Chun et al., 1997b; Ho et al., 2013). 

The third class of models incorporates a canonical immune 
response (Nowak and May, 2000) into the two-compartment 
model (Extended Experimental Procedures, Section C)— since 
a key difference between cell-culture models and chronic infec- 
tion is the presence of an adaptive immune response. Each 
immune parameter added is either tied to a distinct patient- 
measured value or has been measured previously in the literature 
(Table S2). With no added free parameters, the immune model fits 
all available patient data and predicts the same robust $p_{lat}^{opt}$ value.

Latency’s Net Evolutionary Impact Is the Product of Its 
Impact on Both Initial Infection and Systemic Infection 

To calculate the optimal $p_{lat}$ value, the two-compartment models
track latency’s net evolutionary impact across both mucosal and
systemic infections. While the nonlinear models are complex, we
decouple latency’s net impact on viral transmission into a product
of two factors: (1) the average initial inoculum of infected cells
per mucosal inoculation ($I_0$), and (2) the probability that an initially
infected cell establishes systemic infection ($P_{estab}$) (Figure 1A).
This product can be derived analytically when the number of infected
cells is Poisson distributed and when each infected cell
lineage is statistically independent. Under these two assumptions,
the probability of lentiviral transmission per mucosal inoculation
($P_{transmission}$) reduces to:

$$P_{transmission} = 1 - e^{-P_{estab} I_0} \approx P_{estab} I_0 \qquad [1]$$

The equality in Equation [1] is a direct calculation of the Poisson
probability that at least one infected cell in the inoculum $I_0$ establishes
systemic infection. Critically, $P_{transmission} < 10^{-2}$ since <1%
of lentiviral infections result in self-propagating infections (Gray
et al., 2001; Wawer et al., 2005). Given the equality, $P_{transmission} <
10^{-2}$ immediately implies that $P_{estab} I_0 \lesssim 10^{-2}$.
Figure 1. HIV Latency as a Bet-Hedging
Strategy for Maximizing Viral Transmission 

(A) Schematic of the lentiviral transmission pro- 
cess. Lentiviral transmission is illustrated as a 
two-compartment process, beginning with viral 
inoculation in the mucosa and progressing— in 
some cases— to systemic infection in the 
lymphoid tissue, where >98% of CD4+ T cells
reside (Murphy, 2011). The parameter $p_{lat}$ reflects
the probability that an HIV-infected cell enters
latency. An HIV strain incapable of entering latency
($p_{lat} = 0$) would generate increased viral
loads during systemic infection, transferring more
virions to new hosts. However, the latency-incapable
virions would rapidly destroy the small
CD4+ T cell population initially present in the
mucosa of the new host, reducing the probability
of systemic infection (upper). In contrast, an
HIV strain capable of entering latency ($p_{lat} > 0$)
would generate lower viral loads during systemic 
infection, transferring fewer virions to new hosts. 
Yet, the relatively few transferred virions would 
not destroy all mucosal target cells. By entering 
long-lived latency in some mucosal cells, the la- 
tency-capable strain would increase its proba- 
bility of surviving initial infection to establish 
systemic infection (lower). 

(B) Schematic of the two-compartment model of 
lentiviral transmission. The two major processes 
controlling the probability of lentiviral transmission
($P_{transmission}$) are: (1) the inoculum of infected
cells ($I_0$) and (2) the probability that an
infected cell in the inoculum survives initial
infection to establish systemic infection ($P_{estab}$).
(Right to left) HIV enters a host mucosal site, but
due to the small number of permissive target
cells in the early mucosa (prior to day 6), $R_0 < 1$.
To successfully establish systemic infection, the
virus must avoid extinction until $R_0 > 1$. Critically,
the likelihood of an actively infected cell or a free
viral particle surviving until day 6 to initiate systemic
infection is negligible since virus-producing
cells die within 40 hr of infection and viral
progeny are cleared from the system ~100-fold
more rapidly. In contrast, latently infected cells
are long-lived and can reactivate once $R_0 > 1$ to initiate systemic viral expansion. Therefore, despite reducing long-term viral loads, latency may increase
$P_{transmission}$ by increasing viral survival during initial infection. This would make latency evolutionarily beneficial at the population scale.

See also Figure S1.
Having used the equality to establish that $P_{estab} I_0 \lesssim 10^{-2}$, we
can discard the quadratic and higher-order terms in the Taylor
series expansion of $e^{-P_{estab} I_0}$ with negligible impact. This leads
to the subsequent approximation (i.e., linearization) in Equation
[1]: $P_{transmission} \approx P_{estab} I_0$.

Given Equation [1], the overall goal of determining whether latency’s
benefits outweigh its costs reduces to quantifying latency’s
impact on $P_{estab}$ and $I_0$.
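
The quality of the linearization in Equation [1] can be checked numerically. The following minimal Python sketch compares the exact Poisson expression with its first-order approximation; the function names and the example values of $P_{estab}$ and $I_0$ are illustrative assumptions, chosen only so that their product sits at the $10^{-2}$ bound discussed above.

```python
import math


def transmission_exact(p_estab, i0):
    """Poisson probability that at least one infected cell establishes infection."""
    # 1 - exp(-x), computed with expm1 for accuracy at small x
    return -math.expm1(-p_estab * i0)


def transmission_linear(p_estab, i0):
    """First-order (linearized) approximation of the exact expression."""
    return p_estab * i0


def relative_error(p_estab, i0):
    """Relative overestimate made by the linear approximation."""
    exact = transmission_exact(p_estab, i0)
    return (transmission_linear(p_estab, i0) - exact) / exact


# Hypothetical values with p_estab * i0 = 0.01, the upper bound from the text.
error_at_bound = relative_error(1e-3, 10)
```

At the bound $P_{estab} I_0 = 10^{-2}$ the linear form overestimates the exact probability by roughly 0.5%, and the discrepancy shrinks in proportion to the product for smaller values.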

Latency Increases the Probability that an Initially 
Infected Cell Survives Mucosal Infection and 
Establishes Systemic Infection 

To quantify latency’s impact on $P_{estab}$, we begin by tracking lentiviral
survival during mucosal infection alone. As noted above,
the first 5 days of mucosal infection are characterized by a
lack of detectable actively infected cells (Li et al., 2005;
Miller et al., 2005), indicating that $R_0$ in the mucosa is
initially $\ll 1$ (Extended Experimental Procedures, Section D).
$R_0^{muc} \ll 1$ is also consistent with the infrequency of successful
mucosal transmissions ($P_{transmission} < 0.01$) and the ~6-day delay
before systemic infection when lentiviral infections do establish
(Miller et al., 2005).

Both deterministic differential-equation models (Figure 2A)
and stochastic Monte-Carlo models (Figures S2A and S2B) capture
the fitness advantage of latency in the mucosa. Model simulations
are performed with $R_0 < 1$ and an inoculated dose of virus
that results in a few dozen initially infected cells, matching
animal mucosal experiments (Haase, 2011; Miller et al., 2005; 
Zhang et al., 1999). The quantitative models show that— in the 
absence of latency— all virions and infected cells are driven 
Figure 2. An Evolutionary Optimum for Latency

(A) Numerical solutions to Equation [6] showing the dynamics of latently infected cells in early mucosal infection ($R_0^{muc} = 0.25$). As $p_{lat}$ increases, the number of
surviving latently infected cells increases. (Inset) The dynamics of actively infected cells in early mucosal infection showing that as $p_{lat}$ increases, actively infected
cells reach extinction more rapidly.

(B) In systemic infection ($R_0^{LT} = 10$), increases in $p_{lat}$ decrease the virus load (and, therefore, the viral dose transmitted to the next host). Dynamics in (A and B) are
calculated numerically from Equation [6], using the parameters in Table S1 (r = 0).

(C) Schematic flowchart of the derivation of the (optimal) latency probability $p_{lat}^{opt}$ that maximizes $P_{transmission}$. Red text indicates key assumptions made at each
step of the derivation. For example, $R_0^{muc} \ll 1$ implies that the vast majority of latently infected cells during initial infection are produced in the first generation,
leading to the approximation that the number of surviving latently infected cells is $\approx p_{lat} I_0$. The results of the analytic derivation quantify the tradeoff of latency: increasing $p_{lat}$ linearly increases $P_{estab}$ but
decreases $I_0$ by the factor $(1 - p_{lat})$. Since this tradeoff is almost equally balanced, the optimal latency probability, $p_{lat}^{opt}$, approximately equals 0.5.

(D) Normalized probability of host-to-host transmission ($P_{transmission}$) as a function of $p_{lat}$. Results shown are obtained either analytically, from Equation [5]
(magenta line), or numerically using the plateau levels of actively infected cells (I) and latently infected cells (L) simulated in A and B (magenta dots). As in C, the
probability of transmission is maximized when $p_{lat} \approx 0.5$.

(E) Normalized probability of host-to-host transmission when systemic infections emerge from non-latent routes (e.g., dendritic cells) with probability $f_{nonlatent} > 0$
(Equations [S12 and S13]). The maximum probability of transmission occurs at slightly lower $p_{lat}$ values, but $p_{lat}^{opt}$ is still large.

See also Figure S2. 



extinct in the first 5 days of mucosal infection (Figures 2A, inset, 
and S2A). In contrast, low levels of latency enable viral survival 
(Figures 2A and S2B). To test the robustness of these predictions 
across all $R_0 < 1$ and $I_0 < 100$, a continuous-time branching-process
model was developed (Grimmett and Stirzaker, 1992). The
branching-process model (Extended Experimental Procedures, 
Section A) directly computes the viral extinction probability as 
a function of time, providing an efficient alternative to averaging 
thousands of Monte-Carlo simulations for each $R_0$ and $I_0$. In the
absence of latency, the viral extinction probability approaches 1
by day 5 of mucosal infection, except in the small slice when $R_0 \approx 1$
(Figures S2C and S2D), which does not match the levels of
$R_0$ inferred from animal mucosal challenge experiments (Miller
et al., 2005).

For completeness, the surviving number of mucosally infected 
cells was directly computed using a Wright-Fisher model (Hartl
and Clark, 2007; Extended Experimental Procedures, Section
A). The Wright-Fisher simulations demonstrate that the surviving
number of mucosally infected cells increases approximately linearly
with $p_{lat}$ for each $I_0$ (Figures S2E-S2G). This linear dependence
can also be derived analytically. Given that $R_0^{muc} \ll 1$
during initial mucosal infection, the majority of latently infected
cells are produced in the first generation of infection (Extended
Experimental Procedures, Section A). Since these cells are unlikely
to reactivate during the short duration of initial infection,
the number of latently infected cells that survive mucosal infection
is $\approx p_{lat} I_0$, the latent fraction of the inoculum. Thus, both simulations
and analytics indicate that increasing $p_{lat}$ approximately
linearly increases the number of infected cells that survive initial
mucosal infection.
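
The first-generation argument above can be illustrated with a small mean-field calculation. The following Python sketch is not the Wright-Fisher or branching-process model of the Extended Experimental Procedures; it is a simplified generation-by-generation expectation that assumes (hypothetically) discrete generations, no reactivation or death of latent cells during the mucosal phase, and that each actively infected cell produces `r0_muc` new infections on average, each becoming latent with probability `p_lat`. The function name and arguments are introduced here for illustration only.

```python
def expected_latent_survivors(i0, p_lat, r0_muc, generations=50):
    """Expected number of latently infected cells accumulated over generations.

    Simplified mean-field sketch: latent cells neither die nor reactivate,
    and actively infected cells are replaced each generation.
    """
    latent = p_lat * i0           # latent fraction of the inoculum
    active = (1.0 - p_lat) * i0   # actively infected fraction of the inoculum
    for _ in range(generations):
        new_infections = active * r0_muc
        latent += p_lat * new_infections
        active = (1.0 - p_lat) * new_infections
    return latent


# Illustrative values: 30 inoculated cells, p_lat = 0.5, R0 in the mucosa = 0.25.
total_latent = expected_latent_survivors(30, 0.5, 0.25)
first_generation = 0.5 * 30
```

In this sketch the total converges to $p_{lat} I_0 / [1 - (1 - p_{lat}) R_0^{muc}]$, so for the illustrative values the first generation contributes 87.5% of all latent cells, and the total approaches $p_{lat} I_0$ as $R_0^{muc}$ shrinks toward 0.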

Given that latency appears to increase viral survival in the early 
mucosa, we next tested whether latency increases the probability
of systemic infection, which mainly occurs in the lymphoid tissue
where >98% of CD4+ T cells reside (Murphy, 2011). To do so,
the Wright-Fisher model was extended into a two-compartment
model that directly captures the two typical stages of lentiviral
infection: early mucosal infection and systemic (lymphoid) infection
(Extended Experimental Procedures, Section B). Only a single
parameter value is assumed to differ between the early
mucosal and systemic infection compartments. While $R_0^{muc}$ is
parameterized to be <1, $R_0$ during systemic infection in the
lymphoid tissue ($R_0^{LT}$) is set to 10 to match its value in chronically
infected patients (Nowak and May, 2000).

The two-compartment model fits the available human and an- 
imal data of early infections, showing that: (1) only a small frac- 
tion of mucosal infections result in systemic infections (Fraser 
et al., 2007), (2) successful systemic infections emerge after 
~5-7 days (Haase, 2011), and (3) systemic infections initiate
from single “founder” infected cells (Kearney et al., 2009; Keele
et al., 2008). More importantly, the two-compartment model
directly shows that latency increases the probability ($P_{estab}$) of
systemic infection, with $P_{estab}$ maximized when $p_{lat} > 0.6$ (Figure
S2H; Extended Experimental Procedures, Section E).

Latency Decreases the Inoculum in a New Host 

While increasing $p_{lat}$ increases the probability of systemic
lymphoid infection for any given inoculum of initially infected
cells ($I_0$), the probability of lentiviral infection also depends on
$I_0$ itself. Critically, $I_0$ is proportional to the viral load of the transmitting
patient (Extended Experimental Procedures, Equation
S4). Thus, we can quantify latency’s impact on $I_0$ by measuring
latency’s impact on viral loads in systemically infected
patients.

To track latency’s effect on systemic viral loads, we simulated 
the deterministic model in the lymphoid compartment alone (i.e.,
$R_0 = 10$). Initial mucosal infection was not tracked in these simulations
because of the data showing that systemic infections
emerge from single “founder” viruses independent of the inoculum
(Kearney et al., 2009; Keele et al., 2008). These data indicate
that mucosal dynamics affect the probability of systemic
infection, but not the level once established. Thus, we assumed
the existence of a single founder infected cell and solved Equation
[6] numerically. Assuming successful systemic establishment,
the systemic infection model shows that increasing $p_{lat}$ decreases
long-term viral loads (Figure 2B). Consequently,
increasing the frequency of latency ($p_{lat}$) decreases infection
inocula ($I_0$) at the population scale.

The Evolutionarily Optimal Probability of Latency Is ~0.5 

Given Equation [1], if latency’s benefit to $P_{estab}$ exceeds its cost
to $I_0$, then latency increases the probability of lentiviral transmission
($P_{transmission}$). Mathematically, this net evolutionary benefit of
latency can only occur if the (evolutionarily optimal) value of $p_{lat}$
that maximizes $P_{transmission}$ is greater than 0. Here, we test
whether the maximizing value of $p_{lat}$ is greater than 0, directly
quantifying latency’s net evolutionary benefit.

We first derive $P_{estab}$ as a function of $p_{lat}$. After initial mucosal
infection, only latently infected cells persist. As noted
above, due to $R_0^{muc} \ll 1$, the majority of mucosal latent
infections emerge in the first generation of infection, making
the number of surviving latently infected cells $\approx p_{lat} I_0$
(Figures 2A, S2F, and S2G). At least one of these
surviving infected cells must be reactivated (with probability
$P_{react}$) to establish systemic infection. Thus, the per-inoculum
probability of establishing systemic infection is:

$$P_{estab} \approx \frac{p_{lat} I_0}{I_0} P_{react} = p_{lat} P_{react} \qquad [2]$$

Equation [2] emerges from the result that only latently infected 
cells survive initial infection in the mucosa (Figures 2A and S2A- 
S2E). To demonstrate robustness, below we introduce a 
“leakage” probability ($f_{nonlatent}$) that reflects the fraction of systemic
infections that are established by non-latent cells,
including Langerhans dendritic cells, actively infected cells, 
and free virions. 

We next solve for $I_0$ as a function of $p_{lat}$. As noted above, the
average infectious dose (i.e., $I_0$) that can be transmitted to a
new individual is directly proportional to the time integral of the
viral load ($\int V(t)\,dt$, Equation [S4]) over the duration of systemic
infection (Nowak and May, 2000). Analytically solving this time
integral yields (Extended Experimental Procedures, Section B):

$$I_0 \approx \mathrm{const}(p_{lat}) \left[(1 - p_{lat}) R_0^{LT} - 1\right] \qquad [3]$$

The constant term in Equation [3] only implies constant in
$p_{lat}$; it may depend on other parameters. Further, Equation [3]
is solved under the assumption that latently infected cells rarely
reactivate prior to cell death (i.e., $r \ll d_L$ in Table S1). This conservative
assumption reduces the optimal level of latency by
presuming that latently infected cells generally die before
contributing to viral loads. Given this maximal fitness cost, latency
reduces the reproductive ratio during systemic infection,
$R_0^{LT}$, by the factor $(1 - p_{lat})$.
By combining Equations [1-3], $P_{transmission}$ emerges as a function
of $p_{lat}$ (Figure 2C):

$$P_{transmission} \approx P_{estab} I_0 = \mathrm{const}(p_{lat})\, P_{react}\, p_{lat} \left[(1 - p_{lat}) R_0^{LT} - 1\right] \qquad [4]$$

Equation [4] shows that, for each value of $R_0^{LT}$, the probability
of viral transmission has an optimum at a specific $p_{lat}$. To
analytically derive this optimum, we make the simplifying
assumption that $P_{react}$ is constant in $p_{lat}$. This makes
$P_{transmission} \propto p_{lat} \cdot [(1 - p_{lat}) R_0^{LT} - 1]$. Differentiating the simplified
transmission probability with respect to $p_{lat}$ yields the following
optimal probability of latency, denoted $p_{lat}^{opt}$:

$$p_{lat}^{opt} = \frac{1 - (1/R_0^{LT})}{2} \qquad [5]$$

Strikingly, for a typical value of $R_0^{LT} \approx 10$ (Nowak and May,
2000), $p_{lat}^{opt} \approx 0.5$ is the probability of latency that maximizes lentiviral
transmission (Figure 2C).
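
The optimum in Equation [5] can be cross-checked by maximizing the simplified transmission probability directly. The Python sketch below evaluates $p_{lat}[(1 - p_{lat}) R_0^{LT} - 1]$ on a grid and compares the maximizer with the closed form; the function names and the grid resolution are assumptions made for this illustration, and the constant prefactor and $P_{react}$ are dropped because they do not depend on $p_{lat}$ under the stated simplification.

```python
def relative_transmission(p_lat, r0_lt):
    """Simplified transmission probability, up to a p_lat-independent constant."""
    return p_lat * ((1.0 - p_lat) * r0_lt - 1.0)


def optimal_latency_analytic(r0_lt):
    """Closed-form maximizer from Equation [5]."""
    return (1.0 - 1.0 / r0_lt) / 2.0


def optimal_latency_grid(r0_lt, steps=10_000):
    """Brute-force maximizer over an evenly spaced grid on [0, 1]."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda p: relative_transmission(p, r0_lt))


# Compare the two for a few reproductive ratios around the typical value of 10.
comparison = {r0: (optimal_latency_analytic(r0), optimal_latency_grid(r0))
              for r0 in (5.0, 10.0, 20.0)}
```

For $R_0^{LT} = 5$, 10, and 20 the closed form gives 0.4, 0.45, and 0.475, respectively, so the optimum stays close to 0.5 over this range and approaches 0.5 from below as $R_0^{LT}$ grows.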

In agreement with these analytic derivations, numerical solutions
also show that $P_{transmission}$ has an optimum at $p_{lat} \approx 0.5$
(Figure 2D). The numerical simulations are generated by directly
calculating $\int V(t)\,dt$ in model runs, rather than approximating it via
Equation [3]. Sensitivity analyses show that this optimum at
$p_{lat} \approx 0.5$ exists across the entire observed range of $R_0^{LT}$ values
(Figure 2D).



Large Optimal Latency Probability Is Robust to Changes 
in Model Assumptions 

The main prediction of a large $p_{lat}^{opt}$ value remains valid even if
one removes key mathematical assumptions. In particular, the
two-compartment Wright-Fisher model (Extended Experimental
Procedures, Section B) inverts the assumption that $P_{react}$ is
constant in $p_{lat}$, allowing $P_{react}$ to strongly decrease in $p_{lat}$.
Even in this extreme scenario, in which latency has a substantial
fitness cost beyond its reduction of viral loads during systemic
infection, $p_{lat}^{opt} > 1/3$ (Figure S2I). Similarly, the large
$p_{lat}^{opt}$ value remains valid when one relaxes the assumption
that only latently infected cells seed systemic infections. To
show this, we analytically re-calculated $p_{lat}^{opt}$ when a fraction
($f_{nonlatent}$) of successful infections are established via non-latent
routes (Extended Experimental Procedures, Section E). Even if
80% of lentiviral transmissions are established via non-latent
routes, $p_{lat}^{opt} = 0.1$. More generally, as long as $f_{nonlatent}$ is less
than 100%, latency remains evolutionarily beneficial (Figures
2E and S2J).

Strikingly, relaxing other model assumptions increases the
large $p_{lat}^{opt}$ value. For example, relaxing the assumption that
latently infected cells die prior to reactivation (i.e., $r \ll d_L$)
reduces the cost of latency during systemic infection
and therefore increases the optimal latency probability. In
fact, if $r > d_L$, $p_{lat}^{opt} = 1$ (Extended Experimental Procedures,
Section E). Further, if lentiviral transmissibility saturates at
high viral loads (Fraser et al., 2007), so that latency’s
decrease of steady-state viral loads does not decrease
$I_0$, then $p_{lat}^{opt}$ would again equal 1, due to the absence
of a cost to latency (Extended Experimental Procedures,
Section E).



Simplified Two-Compartment Model Fits the High 
Frequencies of Latency Measured in Experimental 
Models 

The predicted value of p_lat^opt ~0.5 matches the latency
frequencies of 50% (Dahabieh et al., 2013) or higher (Calvanese
et al., 2013) measured in cell culture. p_lat^opt ~0.5 is also
consistent with a recent in vivo study in Rhesus macaques, in which a
large reservoir of latently infected cells is documented on day 3
of mucosal infection (Whitney et al., 2014). However, p_lat^opt ~0.5
is inconsistent with the low latency frequencies measured in
chronically infected patients. Only 1 in 10®-10^ patient CD4* 
T cells appear to be latently infected (Chun et al., 1997a; Seda- 
ghat et al., 2007). This has led to estimates of piat ~10^® -10“"^ 
(Rong and Perelson, 2009a; Sedaghat et al., 2007). While more 
recent studies indicate that the latency frequency in patient
cells is ~60-fold higher (Ho et al., 2013), this still leaves
p_lat << 0.5 during chronic infection. Below, we show that the
dichotomy between latency’s high frequency in early infection and
cell culture and latency’s low frequency in chronic infection 
can be explained by the onset of the adaptive immune 
response. 

Mathematical Models Incorporating the Immune 
Response Are Required to Explain the Divergent 
Latency Frequencies between Experimental Models 
and Patients 

Unlike early mucosal infections or cell-culture infections, chronic 
lentiviral infections contain an HIV-specific adaptive immune
response (Turnbull et al., 2009). Previous work has shown that 
this adaptive immune response must be incorporated into the 
basic model of viral dynamics (De Boer and Perelson, 1998; 
Nowak and May, 2000) to fit the 2-3 log drop in viral loads be- 
tween the viral peak during acute infection and the viral set point 
established during chronic infection (Stafford et al., 2000). We 
hypothesized that incorporating a canonical adaptive immune 
response (De Boer and Perelson, 1998; Nowak and May, 2000) 
would also be necessary to observe the reduced level of latently 
infected cells documented during chronic infection. 

A substantial body of literature suggests that the model
assumptions that p_lat and r are constant must be relaxed to account
for the adaptive immune response. In particular, the activation
levels of CD4+ T cells appear to increase during chronic infection
in vivo, as measured by the expression levels of three activation
markers (Li et al., 2005) and the increased turnover rates
of CD4+ T cells (Mohri et al., 1998). While the exact mechanism
is unknown, one potential driver of CD4+ T cell activation is the
body’s homeostatic response to the depletion of CD4+ T cells
during acute infection (Mohri et al., 1998). Another potential
mechanism is the secretion by CD8+ T cells of activating cytokines
such as TNF-α (Murphy, 2011). Whatever the mechanism,
cellular activation factors sharply decrease p_lat and sharply
activate HIV transcription (Calvanese et al., 2013; Chun et al., 1998;
Siliciano and Greene, 2011), for example, by accumulating
transcription factors (e.g., NF-κB) that activate the HIV LTR
promoter. Further, in the companion study (Razooky et al., 2015),
mathematical modeling shows that cellular activation levels
bias HIV circuit output (i.e., p_lat and r), even though latency is
hardwired into the circuit.

Figure 3. Incorporating the Immune Response Explains the Divergent
HIV-Latency Frequencies between Experimental Models and
Patients

(A) Extended model of systemic HIV infection, which includes CD8+ T cells (E)
that kill actively infected cells (or suppress viral replication) and activate
latently infected cells (Equations [S9] and [S10]).

(B) The latency probability (p_lat) and reactivation rate (r) change dramatically
around the time of the viremia peak due to the immune response (e.g., due to
bystander cytokine activation by immune cells, Equation [S10]). Including
immune cells in the model can account for the low incidence of
latently infected cells in chronically infected patients.



Since an adaptive immune response is associated with an increase
in CD4+ T cell activation levels (Li et al., 2005) that reduces
p_lat and increases r (Calvanese et al., 2013; Chun et al.,
1998; Siliciano and Greene, 2011), we hypothesized that the
adaptive immune response could be responsible for the
reduced p_lat levels in chronically infected patients (Figure 3A).
This hypothesis was quantitatively tested by allowing p_lat and r
to vary as functions of the effector CD8+ T cell concentration,
E[t] (Extended Experimental Procedures, Section C). Before
the initiation of the adaptive immune response (i.e., before
chronic infection), the model naturally generates high latency
probabilities of ~0.5 and low reactivation rates, as in the
simplified models above. However, after the viremia peak, cellular
activation (Li et al., 2005) and cell death (Doitsh et al., 2010) become
substantial, increasing r(E[t]) to high levels and decreasing
p_lat(E[t]) to low levels (Figure 3B). As a result, the immune model
mechanistically explains the divergent latency frequencies measured
between experimental models (cell culture and non-human
primates) and chronically infected patients (Figure 3B).

Models Incorporating the Immune Response Fit 
Available Patient Data while Retaining the Robust 
Optimal Latency Prediction 

While the immune-response model accounts for the low levels of p_lat
measured during chronic infection, validation against all available
patient data is a critical test of the model. Thus, we tested whether
the model could recapitulate extant patient data on: (1) viral loads
before ART (Fraser et al., 2007), (2) effector T cell concentrations
before ART (Turnbull et al., 2009), (3) latently infected cells before
ART (Chun et al., 1997b), and (4) latently infected cells after ART
(Finzi et al., 1999). Strikingly, the extended immune-response
model is able to fit these four data plateaus (Figure 4A), using
established parameter estimates (Table S2). In particular, the
immune-response model reproduces the depressed latent reservoir
of ~10^6 cells measured in chronically infected patients.
Further, the model captures the ~1 log drop in the latent reservoir
under ART (Figure 4A), because ART leads to antigen depletion.
This causes the immune-cell population to contract and the
reactivation rate r(t) to decrease to its low background level. To be
sure that these fits were not artifacts due to model complexity,
we also tested simplified immune-response models (Extended
Experimental Procedures, Section E). While these simplified
models fit the four steady-state plateaus, they cannot reproduce
the pre-steady-state kinetics measured in patients (Figure S3). In
contrast, the full immune model fits both steady-state and
pre-steady-state kinetics (Figure 4A, inset), including the viral decay
kinetics measured in patients who undergo ART (Markowitz
et al., 2003).

Critically, the level of the adaptive immune response does not
change the prediction of the simplified model (i.e., the model
without an immune response) that the initial latency probability
p_lat(0) has a large optimum of ~0.5 (Figures 4B and S3). As a
result, the prediction of the high optimal latency probability is
directly applicable to natural lentiviral hosts even if they exhibit
depressed immune responses. Further, as in the simplified
models lacking an immune response, the large p_lat^opt value is
preserved even when a large fraction of systemic infections are
mediated by non-latent cells (Extended Experimental Procedures,
Section E). The optimal latency prediction is also robust
to perturbations of epidemiological assumptions, such as the
monotonic dependence of lentiviral transmission on viral loads
(Extended Experimental Procedures, Section E). Overall, the
robustness of p_lat^opt in the immune model matches the robustness
of p_lat^opt in the simplified models.

Experimental Depletion of CD8+ T Cells in SIV-Infected
Macaques Will Increase the Latent Reservoir ~3 Logs
More Than Viremia

The immune model argues that CD8+ T cells depress the latent
reservoir during chronic infection, either directly (e.g., through



1008 Cell 160, 1002-1012, February 26, 2015 ©2015 Elsevier Inc.








Figure 4. The Extended Immune-Response Model Fits the Available
In Vivo Data and Does Not Change the Optimal Latency Probability
for Resting Cells, p_lat^opt(0)

(A) Dynamics of cell compartments during systemic infection calculated from
Equations [S9] and [S10]. Antiretroviral therapy (ART) initiated during
steady-state infection causes a decline of the latent reservoir (L). The
saturation of the fall in the latent reservoir is due to the decline in immune
cells (E) during ART. (Data points across human patients) Virus load prior to
ART (Fraser et al., 2007) (green triangles); latent cells prior to ART (Chun
et al., 1997b) and after highly active ART (Finzi et al., 1997) (cyan
triangles); effector CD8 T cells (Turnbull et al., 2009) (red triangles). For
each data set (triangles), box-and-whisker plots show the upper and lower
quartiles of the patient data. (Blowout) Virus load after the onset of ART
(Markowitz et al., 2003) (green triangles, error bars show SD).

(B) Normalized transmission rate P_transmission as a function of p_lat(0)
calculated from the dynamics in A and Equation [1]. Two cases are shown for
comparison: with immune cells (E, green triangles) and without immune cells
(E = N = 0, blue curve). Inclusion of immune cells into the model only weakly
affects the prediction of a large optimal latency probability for resting
cells, p_lat^opt(0) ~0.5.
Model parameters in A and B are in Tables S1 and S2 (with =15 and 
p_lat(0) = 0.5 in A). See also Figure S3.

secreted cytokines) or indirectly (e.g., through activation of
downstream cell types that secrete factors). Thus, a direct test
of the model can be achieved by depleting CD8+ T cells with
anti-CD8 antibodies. CD8 depletion should increase the latency
probability (p_lat) toward its original high value of ~0.5 and
concomitantly decrease the reactivation rate (r) toward its
original low value. In fact, the model quantitatively predicts the
outcome of this experiment. Whereas previous CD8 depletion
studies have already measured an ~1-3 log increase in the number
of actively infected cells following CD8 depletion in SIV-infected
Rhesus macaques (Jin et al., 1999; Metzner et al., 2000;
Schmitz et al., 1999), the model predicts that the latent reservoir
will increase by ~5 logs following CD8 depletion (Figure 5A).
Thus, the increase in the latent reservoir would be ~3 logs
greater than the increase in actively infected cells and viremia
(Figure 5B). A corollary prediction is that CD8 depletion during
early pre-peak infection (Matano et al., 1998), prior to a high-level
adaptive immune response, will only increase the latent reservoir
~2- to 3-fold and will thus be harder to reliably measure
(Figure S4). Notably, these experimental tests of the model require
viral outgrowth assays (Finzi et al., 1997) since directly
measuring proviral DNA will only report on actively infected cells,
which outnumber latently infected cells by orders of magnitude.
A viral outgrowth assay post-CD8 depletion would provide
quantitative verification of the model and would consequently test the
model’s output that latency is a viral bet-hedging strategy tuned
by natural selection.

Viral Strains Engineered to Have Higher Replicative 
Fitness — via Reduced Latency — Will Exhibit Lower 
Infectivity in Animal-Model Mucosal Inoculations 

A more direct experimental test of the model would involve 
mucosal challenge experiments using recombinant SIV strains 
engineered to have substantially reduced latency probabilities. 
Engineering strains with reduced latency efficiencies appears 
possible since different HIV-1 clades are already known to 
exhibit different latency frequencies. These clade-specific differences
appear to be driven by cis elements within the HIV-1 LTR
(Jeeninga et al., 2008; van der Sluis et al., 2011). The model 
directly predicts that the reduced-latency recombinants will 
establish self-propagating systemic infections less frequently 
than the wild-type strains maintaining high latency frequencies. 
Further, these reduced latency strains could be quantitatively 
tested for increased replicative fitnesses via competitive growth 
assays with wild-type strains. If decreasing latency both 
increased replicative fitness and decreased successful lentiviral 
transmission, this would directly show that proviral latency pro- 
vides a bet-hedging advantage that increases viral transmission 
despite reducing steady-state viral loads. 

Proviral Latency Contrasted with Alternate Mechanisms 
of Initial Viral Survival 

A natural question is whether alternatives to latently infected 
CD4+ T cells exist that also increase the probability of initial viral 
survival in the mucosa. One proposed non-latent route is den- 
dritic cell migration from the mucosa to the target-cell rich 
lymphoid tissue (Kahn and Walker, 1998; Wu and KewalRamani, 
2006). More specifically, Langerhans dendritic cells present in 
the mucosa can be infected by HIV and are prone to migration 
to the lymphoid tissue, where they can support subsequent 
dissemination of HIV by c/s transfer (Peressin et al., 2014). Yet, 
Langerhans cells’ dissemination of HIV may be partially blocked 
by neutralizing antibodies (Su et al., 2012). Follicular dendritic 
cells may provide another route of viral survival; however, these 
cells do not migrate to the mucosa (Murphy, 2011). In contrast to 
dendritic cells, proviral latent cells are neither impacted by 
neutralizing antibodies (being quiescent) nor blocked by the 
mucosal barrier, which has been proposed to be a viral bottle- 
neck (Haaland et al., 2009). Latency can thus act as a type of 
“Trojan horse” for the virus. More fundamentally, even if alternative
routes of initial viral survival exist, the results of this study
(i.e., p_lat^opt > 0) remain robust as long as latency seeds some
fraction of systemic infections (Figures 2E and S2J).

Figure 5. Depletion of CD8+ T Cells in SIV-Infected
Macaques Is Predicted to Increase
the Latent Reservoir Significantly More
Than Viremia

(A) Predicted dynamics in systemic infection for the
extended model (Equations [S9] and [S10]). Data
points and parameters are as in Figure 4, with the
upper and lower quartiles of the patient data
(triangles) shown in box-and-whisker plots.

(B) The ratio of virions to latently infected cells will
be inverted following CD8+ T cell depletion
(post-depletion corresponds to day 125 in A). The
dramatic 2-log increase in viremia has been observed,
as shown by the data points at 1 week
post-depletion in Jin et al. (1999) and Schmitz et al.
(1999). The dashed horizontal line at 10“^ RNA/ml/ 
cell corresponds to a 1:1 ratio of latently and 
actively infected cell counts. Blue bars correspond 
to the parameters and compartment sizes in the 

simulation example in A. The maximal expected errors (vertical bars) are estimated from the whisker box borders in A (the two middle quartiles). Since the dynamic
balance between actively infected cells and latently infected cells is modulated by p_lat and r, the depletion of immune cells affecting p_lat and r is predicted to
change this balance and disproportionately increase the latent reservoir.

See also Figures S4 and S5. 

Suppressing Latent Reactivation in the First Week of
Infection Could Substantially Reduce the Latent
Reservoir, Enhancing “Kick-and-Kill” Therapy

The model presents a potential therapeutic strategy that
exploits the need for latently infected cells to reactivate to both
establish systemic infection and dramatically increase the
size of the latent reservoir (Figure S5). Thus, if the early
reactivation rate were reduced (for example, by suppressing
antigen-presenting cell (APC) migration [Peressin et al., 2014] or
HIV transcriptional reactivation [Weinberger et al., 2008]),
systemic infection would be rendered less likely and the
latent reservoir size would be substantially decreased
(Figure S5). While a caveat of this proposed approach is that it
requires detection and treatment within the first week of infection,
similar early treatments have been achieved; for a review, see
Haase (2011). Critically, a substantially smaller latent reser- 
voir of ~10^ cells would require the reactivation of far fewer 
latent cells by imperfect “shock-and-kill” strategies (Archin 
et al., 2012; Deeks, 2012). As a result, suppression of reac- 
tivation during the first week of infection followed by shock 
and kill could substantially enhance the chances of HIV 
eradication. 

Implications for Alternate Antiviral Therapy Approaches 

A further implication of the result that latency is a hardwired, 
evolutionarily maintained trait is that it may be easier to control 
HIV by increasing, rather than purging, the latent reservoir (Dar 
et al., 2014; Weinberger and Weinberger, 2013; Weinberger 
et al., 2008). Current shock-and-kill therapies are fighting natu- 
ral selection in attempting to reactivate each of ~10^ latent 
cells. In contrast, discovering a non-toxic compound that 
switches 90%-95% of actively infected cells to latency would 
drive HIV’s basic reproductive ratio (R_0) below 1, making HIV
infection unsustainable. While still a hypothetical avenue, 
enhancing viral latency may provide a viable alternative if 
shock-and-kill strategies fail to achieve their goal of complete 
eradication. 
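
The 90%-95% figure can be checked with a short calculation. The sketch below is not taken from the paper's supplement: it assumes the two-compartment model given in Experimental Procedures, in which an infection becomes active directly with probability 1 - p_lat or, after latency, with probability r/(r + d_L). The function names, the default arguments, and the resulting "effective R_0" expression are illustrative assumptions; only R_0 ~10 for systemic infection comes from the text.

```python
def effective_r0(r0, p_lat, r=0.0, d_L=1.0):
    """Reproductive ratio once a fraction p_lat of infections enter latency.

    Assumption (derived from the two-compartment ODEs, not quoted from the
    paper): a latent cell reactivates before dying with probability
    r / (r + d_L), so only that share of latent infections still
    contributes to viral spread.
    """
    reactivation_prob = r / (r + d_L)
    return r0 * ((1.0 - p_lat) + p_lat * reactivation_prob)


def critical_latency_probability(r0, r=0.0, d_L=1.0):
    """Latency probability at which effective_r0 equals exactly 1."""
    never_reactivates = 1.0 - r / (r + d_L)
    return (1.0 - 1.0 / r0) / never_reactivates


if __name__ == "__main__":
    r0_sys = 10.0  # approximate systemic value quoted in the text
    print(critical_latency_probability(r0_sys))
    for p_lat in (0.90, 0.95):
        print(p_lat, effective_r0(r0_sys, p_lat))
```

Under these assumptions, and with negligible reactivation, the threshold is p_lat = 1 - 1/R_0, which for R_0 ~10 is 90%. Any nonzero reactivation rate raises the required fraction.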



EXPERIMENTAL PROCEDURES 



A Simplified Two-Compartment Model to Quantify the Net Impact of
Latency on Lentiviral Transmission

All models described in the main text are variations of the
well-parameterized basic model of viral dynamics (Nowak and May, 2000)
expanded to include latent infections (Rong and Perelson, 2009a, 2009b;
Sedaghat et al., 2007, 2008). Absent an immune response, the deterministic
form of the models is captured by the following ordinary differential
equations:



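Uninfected "target" cells:

\[ \frac{dT}{dt} = \underbrace{b}_{\text{replenishment}} - \underbrace{d_T T}_{\text{natural death}} - \underbrace{kVT}_{\text{infection}} \]

Actively infected cells:

\[ \frac{dI}{dt} = \underbrace{(1 - p_{\mathrm{lat}})\,kVT}_{\text{active infection}} - \underbrace{d_I I}_{\text{death}} + \underbrace{rL}_{\text{reactivation}} \]

Latently infected cells:

\[ \frac{dL}{dt} = \underbrace{p_{\mathrm{lat}}\,kVT}_{\text{latent infection}} - \underbrace{d_L L}_{\text{death}} - \underbrace{rL}_{\text{reactivation}} \]

Virus:

\[ \frac{dV}{dt} = \underbrace{n\,d_I I}_{\text{production}} - \underbrace{cV}_{\text{clearance}} \]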



In the model above, uninfected “target” cells (T) are produced at rate b,
decay at rate d_T, and can be infected by virus particles (V) at rate k. Upon viral
infection, target cells become either latently infected cells (L) with probability
p_lat or actively infected (virus-producing) cells (I) with probability
1 - p_lat. Latently infected cells reactivate into actively infected cells at rate r or
die at the (slow) rate d_L. Actively infected cells produce “burst sizes” of n
virions as they die at rate d_I. Virions decay at the relatively fast rate c. All
parameter values are given in Table S1; Table S2 contains parameters for the model
extended to include an adaptive immune response (Extended Experimental
Procedures, Section C).
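
The four equations above can be integrated numerically in a few lines. The following is a minimal sketch using a hand-written fourth-order Runge-Kutta step so that it needs no external libraries. The parameter values in `ILLUSTRATIVE_PARAMS` are assumptions chosen only so that R_0 = bkn/(c d_T) = 10; they are not the values of Table S1, and the function names are hypothetical.

```python
def rhs(state, p):
    """Right-hand side of the two-compartment latency model (no immune response)."""
    T, I, L, V = state
    infection = p["k"] * V * T
    dT = p["b"] - p["d_T"] * T - infection
    dI = (1.0 - p["p_lat"]) * infection - p["d_I"] * I + p["r"] * L
    dL = p["p_lat"] * infection - p["d_L"] * L - p["r"] * L
    dV = p["n"] * p["d_I"] * I - p["c"] * V
    return (dT, dI, dL, dV)


def rk4_step(state, dt, p):
    """Advance the state by one classical Runge-Kutta (RK4) step of size dt."""
    k1 = rhs(state, p)
    k2 = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)), p)
    k3 = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)), p)
    k4 = rhs(tuple(s + dt * k for s, k in zip(state, k3)), p)
    return tuple(
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(p, state0, days, dt=0.01):
    """Integrate from state0 = (T, I, L, V) for the given number of days."""
    state = state0
    for _ in range(int(round(days / dt))):
        state = rk4_step(state, dt, p)
    return state


# Illustrative values only (assumed, NOT Table S1); rates are per day.
ILLUSTRATIVE_PARAMS = {
    "b": 10.0,      # target-cell production
    "d_T": 0.01,    # target-cell death
    "k": 2.3e-4,    # infection rate constant, set so that R_0 = 10
    "p_lat": 0.5,   # latency probability
    "d_I": 1.0,     # death rate of actively infected cells
    "d_L": 0.01,    # death rate of latently infected cells
    "r": 0.001,     # reactivation rate
    "n": 1000.0,    # burst size
    "c": 23.0,      # virion clearance
}

if __name__ == "__main__":
    T0 = ILLUSTRATIVE_PARAMS["b"] / ILLUSTRATIVE_PARAMS["d_T"]
    print(simulate(ILLUSTRATIVE_PARAMS, (T0, 0.0, 0.0, 1e-3), days=1000))
```

A fixed step of 0.01 days is small relative to the fastest rate used here (c = 23 per day). A stiff or adaptive solver would be a reasonable substitute if the parameters were changed substantially.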

Critically, the infection models can be simplified by re-parameterizing the
equations in terms of the basic reproductive ratio: R_0 = bkn/(c d_T). This
“non-dimensionalization” enables us to capture the disparate dynamics between mucosal
infection (Figure 2A) and systemic infection (Figure 2B) by simulating the same
model for both infection stages and only varying a single parameter, R_0. Further,
R_0^muc is experimentally bounded to be << 1 from the viral dynamics during initial
infection (Miller et al., 2005), and R_0^sys is similarly measured to be ~10 during
systemic infection (Nowak and May, 2000). As a result, no assumptions about
unknown parameter values are needed to obtain the optimal latency probability
(p_lat^opt). More directly, Equation [5] shows that p_lat^opt only depends on R_0^sys
(for detailed derivations and tests of the models, see Extended Experimental
Procedures).
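
The form of the basic reproductive ratio follows from the model definitions by counting secondary infections per infected cell. The worked step below is a sketch. The symbol T_0 for the infection-free target-cell level is introduced here for illustration, and the calculation assumes no latency and that virions are lost only through clearance at rate c, as in the virus equation of the model.

```latex
T_0 = \frac{b}{d_T}, \qquad
R_0 \;=\; \underbrace{n}_{\text{virions per infected cell}}
\times \underbrace{\frac{k\,T_0}{c}}_{\text{infections per virion}}
\;=\; \frac{b\,k\,n}{c\,d_T}
```

Each actively infected cell releases n virions over its lifetime, and each virion persists for an average time 1/c while infecting target cells at rate kT_0, which gives the product above.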

SUPPLEMENTAL INFORMATION 

Supplemental Information includes Extended Experimental Procedures, five 
figures, and two tables and can be found with this article online at
http://dx.doi.org/10.1016/j.cell.2015.02.017.

SUMMARY

Aging is a complex process that affects multiple 
organs. Modeling aging and age-related diseases in 
the lab is challenging because classical vertebrate 
models have relatively long lifespans. Here, we 
develop the first platform for rapid exploration of 
age-dependent traits and diseases in vertebrates, 
using the naturally short-lived African turquoise 
killifish. We provide an integrative genomic and 
genome-editing toolkit in this organism using our 
de-novo-assembled genome and the CRISPR/Cas9 
technology. We mutate many genes encompassing 
the hallmarks of aging, and for a subset, we produce 
stable lines within 2-3 months. As a proof of princi- 
ple, we show that fish deficient for the protein subunit 
of telomerase exhibit the fastest onset of telomere- 
related pathologies among vertebrates. We further 
demonstrate the feasibility of creating specific ge- 
netic variants. This genome-to-phenotype platform 
represents a unique resource for studying vertebrate 
aging and disease in a high-throughput manner and 
for investigating candidates arising from human 
genome-wide studies. 

INTRODUCTION 

Aging is the number one risk factor for many human pathologies,
including diabetes, cancer, cardiovascular, and neurodegenerative
diseases (Niccoli and Partridge, 2012). Thus,
delaying aging could help postpone the onset of these devastating
ailments and increase healthspan. Because aging
affects multiple organs and systems in humans (Lopez-Otin
et al., 2013), it is one of the most challenging processes
to model in the lab. So far, the study of aging has been
dominated by non-vertebrate short-lived model organisms,
such as yeast (C. cerevisiae), worm (C. elegans), and fly
(D. melanogaster), which has allowed the identification of
remarkably conserved aging-related pathways, such as the
TOR and Insulin/IGF pathways (Kenyon, 2010). However,
some important aspects of human aging and disease phenotypes
cannot be faithfully recapitulated in invertebrate models,
as they lack specific organs and systems (e.g., blood, bones, 
and an adaptive immune system) that are crucial components 
of human aging and age-related pathologies. Vertebrate model 
systems, namely the mouse (M. musculus) and zebrafish 
(D. rerio), have also been exploited to probe genes involved 
in aging and age-related diseases. However, experimental 
studies have been hampered by the relatively long lifespan of 
mice and zebrafish (maximal lifespan of 3-4 and 5 years, 
respectively [Tacutu et al., 2013]) and high costs of mainte- 
nance, especially for mice. Mouse models with accelerated 
onset of age-associated disease (e.g., neurodegeneration) 
can partially address this issue (Trancikova et al., 2011), but 
these models uncouple the disease phenotype from its main 
risk factor— aging— and they remain expensive to use. Thus, 
a new vertebrate model is needed to better understand the 
principles of vertebrate aging and to study age-related dis- 
eases in the context of aging. 

The African turquoise killifish Nothobranchius furzeri is a 
naturally short-lived vertebrate that lives in ephemeral water 
ponds in Zimbabwe and Mozambique (Figure 1A), where water 
is only present during a brief rainy season. This fish species has 
likely evolved a compressed life cycle (as short as 30-40 days 
from egg to egg-laying adult) to adapt to its transient habitat. 
The turquoise killifish is currently the shortest-lived vertebrate 
that can be bred in captivity (Genade et al., 2005; Valenzano 
et al., 2006), with a lifespan of 4-6 months in optimal laboratory 
conditions (6 to 10 times shorter than the lifespan of mice and 
zebrafish, respectively). Importantly, despite its short lifespan, 
this fish recapitulates typical age-dependent phenotypes and 
pathologies such as decline in fertility, sarcopenia, cognitive 
decline, and cancerous lesions (Di Cicco et al., 2011; Genade 
et al., 2005; Valenzano et al., 2006). It also displays a conserved 
response to environmental stimuli known to affect the aging 
rate in other species, such as dietary restriction (Terzibasi 
et al., 2009). These characteristics make this fish an attractive 
model organism to study vertebrate aging, physiology, and 
age-dependent diseases throughout organismal lifespan (Di 
Cicco et al., 2011). Furthermore, the turquoise killifish telomeres 
are similar in length to those of humans (6-8 kb) (Hartmann 
et al., 2009), unlike laboratory mouse telomeres, which are 

Figure 1. A Versatile Platform for Rapid Exploration of Aging and Longevity Genes in the Naturally Short-Lived Turquoise Killifish

(A) Lifespan of non-vertebrate and vertebrate model systems widely used for aging and disease research (top), when compared to the lifespan of the turquoise
killifish (bottom). The turquoise killifish originates from ephemeral water ponds in Zimbabwe and Mozambique (bottom). 

(B) Examples of genes encompassing the hallmarks of human aging (modified with permission from Lopez-Otin et al. [2013]). 

(C) Genomic pipeline to generate CRISPR/Cas9 gRNAs in a new model organism using our newly created genomic tools (de-novo-assembled turquoise killifish 
genome, epigenome, and transcriptome). Gene models and gRNA selection are available via CHOPCHOP. 

(D) CRISPR/Cas9 genome-editing pipeline to generate stable mutant fish lines in the turquoise killifish. Overall, the total time for generating a stable mutant line in 
the lab (i.e., steps 1-4) is about 2-3 months. 



very long (50-150 kb) (Lee et al., 1998). Thus, findings from aging
studies in the turquoise killifish should be relevant for vertebrate
aging, including humans. The rapid timescale of aging in
this species should not only facilitate the causative identification
of factors regulating vertebrate lifespan but also allow
longitudinal studies.

The turquoise killifish has additional advantages as a model
system. Contrary to many other fish, including zebrafish, the tur- 
quoise killifish has an XY-based sexual determination (Valenzano 
et al., 2009). Furthermore, there exists a highly inbred strain of 
the turquoise killifish (the GRZ strain, used in this study), as 
well as a number of wild-derived strains (Terzibasi et al., 2008). 
The availability of multiple strains provides an important advan- 
tage for genetic studies and for mapping traits that are different 
between strains (e.g., color, maximal lifespan) (Kirschner et al., 
2012; Valenzano et al., 2009). Collectively, these characteristics 
of the turquoise killifish— coupled with the ease of rapidly gener- 
ating many offspring and low maintenance costs— make this fish 
a promising vertebrate model, uniquely fit to address aging and 
age-related diseases (Genade et al., 2005; Valenzano et al., 
2006). 

For the African turquoise killifish to become a widely used 
vertebrate model compatible with high-throughput approaches, 
key tools need to be created. Although preliminary genetic tools 
have been developed in the turquoise killifish, including genetic 
linkage maps (Kirschner et al., 2012; Reichwald et al., 2009;
Valenzano et al., 2009), and Tol2-based transgenesis (Hartmann
and Englert, 2012; Valenzano et al., 2011), the lack of a
sequenced genome and ability to manipulate endogenous genes
has drastically limited the use of this organism. The RNA-guided
CRISPR (clustered regularly interspaced short palindromic
repeats) associated Cas9 nuclease (Jinek et al., 2012) has recently
emerged as an effective approach for introducing targeted muta- 
tions in a variety of model organisms, such as yeast, worms, flies, 
zebrafish, and mice, as well as several non-model organisms (for 
a detailed list see Hsu et al. [2014]). However, genome-editing 
approaches have never been reported in the African turquoise 
killifish, probably because of the lack of a sequenced genome. 

Here, we create the first platform for the rapid exploration of 
aging and aging-related diseases in vertebrates by developing 
new genomic and genome-editing tools in a promising verte- 
brate model, the naturally short-lived African turquoise killifish. 
As a proof of principle for the versatility of this platform, we 
generate a suite of mutated alleles for 13 genes encompassing 
the hallmarks of aging and report six stable lines to date. We 
characterize a loss-of-function mutation in the gene encoding 
the protein component of telomerase and show that telome- 
rase-deficient turquoise killifish recapitulate characteristics of 
human pathologies. This platform should allow high-throughput 
studies on aging and longevity in vertebrates, as well as longitu- 
dinal modeling of human diseases. Our platform should also 
enable systematic examination of unexplored candidates identi- 
fied in human genomic studies. 

RESULTS 

A Platform for the Study of “Hallmarks of Aging” Genes 
in Vertebrates 

We sought to create a versatile platform to rapidly model human
aging and diseases in the short-lived turquoise killifish (Figure 1A).
A recent review has categorized nine “hallmarks of aging”
(Lopez-Otin et al., 2013), including telomere attrition, deregulated
nutrient sensing, and stem cell exhaustion (Figure 1B). We
selected 13 genes encompassing those hallmarks (e.g., the protein
subunit of telomerase [TERT], insulin-like growth factor 1
receptor [IGF1R], and S6 kinase [RPS6KB1]) with the overall goal of
generating mutant alleles for each of them (Figure 1B).

Because the turquoise killifish is an emerging model, the first 
step was to identify genes in this organism. To this end, we gener- 
ated a wide range of genomic data sets and designed a tailored 
genomic pipeline. We built gene models, using our recently 
assembled turquoise killifish genome (Figure 1C). We verified 
the accuracy of gene models and analyzed mRNA expression 
pattern using our RNA sequencing (RNA-seq) data sets from 
four tissues (Figure 1C). We generated an H3K4me3 chromatin 
immunoprecipitation sequencing (ChIP-seq) data set to define 
transcriptional start sites (TSSs) and further support annotations, 
especially for non-coding RNAs (Rinn and Chang, 2012) (Fig- 
ure 1C). Additional support for protein-coding gene annotation 
was obtained using protein homology (Figure 1C). Finally, we
designed guide RNA (gRNA) targets for CRISPR/Cas9 genome
editing (Figure 1C and Table S1). The genome of the turquoise
killifish and the RNA-seq and H3K4me3 ChIP-seq datasets are
provided as resources (accession numbers JNBZ00000000,
SRP041421, and SRP045718, respectively). The full description
and analysis of the genome will be reported elsewhere (D.R.V., 
B.A.B., P.P.S., and A.B., unpublished data). The gene models 
and gRNA design are made available via the CHOPCHOP plat- 
form (https://chopchop.rc.fas.harvard.edu/) (Montague et al., 
2014). Together, these data sets provide an integrative resource 
for the scientific community, not only to target specific genes in 
the turquoise killifish, but also for comparative genomics and 
evolutionary studies of aging and longevity. 

We then designed a CRISPR/Cas9 genome-editing strategy in 
the turquoise killifish. Based on our tailored genomic pipeline, we
generated two to five independent gRNA sequences for each
gene (Table S1). We then microinjected a mixture of Cas9
mRNA and gRNAs (Hwang et al., 2013; Jao et al., 2013) into fertilized
turquoise killifish eggs at the single-cell stage (Figure 1D).
Cas9 is known to introduce double-strand breaks that are repaired
by non-homologous end joining (NHEJ), resulting in
genome editing (small deletions or insertions, also known as indels)
(Hsu et al., 2014). Successful editing was assessed in a
subset of eggs by cloning and sequencing of the targeted region
72 hr after injection (Figure 1D, step 1). The gRNAs that resulted
in successful editing were then used to generate F0 chimeras
that were crossed with wild-type fish to generate F1 embryos
(Figure 1D, step 2). Successful germline transmission was assessed
on pooled F1 embryos, usually 45-60 days after initial
injection (Figure 1D, step 3). F1 embryos from successful F0 chimeras
(founders) were raised to adulthood, and fish with desired
alleles were maintained as stable lines and further backcrossed
to minimize potential off-target editing by Cas9 (Figure 1D,
step 4). We will first describe our results with TERT as a paradigm 
for modeling telomere attrition and then present the general 
toolbox of 13 mutant alleles in genes involved in the hallmarks 
of aging. 
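
The gRNA target search described above can be illustrated with a minimal sketch. It assumes that the Cas9 used is the Streptococcus pyogenes enzyme, which recognizes an NGG protospacer adjacent motif (PAM) immediately 3′ of a 20-nt target. The function names and the example sequence are hypothetical; they are not taken from our pipeline or from CHOPCHOP, which additionally considers criteria such as off-target matches.

```python
import re


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_protospacers(seq, guide_len=20):
    """List (start, protospacer, pam) for each guide_len-nt site followed by NGG.

    Only the strand given is scanned; pass reverse_complement(seq) to scan
    the opposite strand. Assumes an SpCas9-style NGG PAM.
    """
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    return [(m.start(), m.group(1), m.group(2))
            for m in pattern.finditer(seq.upper())]


# Hypothetical 27-nt sequence for illustration, not a killifish locus.
example = "ACGTACGTACGTACGTACGT" + "TGG" + "AAAA"
for start, protospacer, pam in find_protospacers(example):
    print(start, protospacer, pam)
```

In practice, both strands of each candidate exon would be scanned and the resulting sites filtered before synthesis; that filtering is outside the scope of this sketch.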

Modeling Telomere Attrition 

Telomerase, which comprises the protein component TERT and
the RNA component TERC, elongates telomeres after replication,
thereby maintaining telomere length (Figure 2A). Telomeres

Figure 2. Example of Rapid Genome Editing of TERT, the Protein Component of Telomerase, in the Turquoise Killifish

(A) The telomerase complex and gene model prediction for TERT and TERC using genomic and epigenomic profiling.

(B) Conservation of TERT protein domains between human (hTERT) and the turquoise killifish (kTERT).

(C) TERT protein sequence divergence predicts evolutionary tree. Scale: substitution per site. Number on nodes: level of confidence.

(D) Relative expression of TERT mRNA in brain, liver, testis, and tail using RNA-seq. FPKM, fragments per kilobase of exon per million fragments mapped.

(E) Successful editing of the turquoise killifish TERT gene. The wild-type (WT) sequence, as well as the length of deletions (Δ), is indicated relative to the protospacer
adjacent motif (PAM, in gray) and the guide RNA sequence (gRNA, in red). The deletions that gave rise to stable lines (Δ3 and Δ8) are indicated (in yellow
with black outline).

(F) Top: location of the gRNA successfully targeting TERT exon 2 (red line), which is upstream of the exons encoding TERT catalytic domains. TERT Δ8 allele is
predicted to generate a protein with a premature stop codon. Bottom: the TERT Δ8 allele is successfully transcribed to RNA, as measured by RT-PCR followed by
cDNA sequencing. RT, reverse-transcriptase.



shorten during vertebrate aging, including in the turquoise killifish
(Artandi and DePinho, 2010; Hartmann et al., 2009), and
are considered to be a good biomarker of biological age (Boonekamp
et al., 2013). In humans, mutations in TERT or other genes
in the telomere-protecting complex result in a spectrum of diseases
characterized by tissue homeostasis failure, such as dyskeratosis
congenita (Armanios, 2009). Dyskeratosis congenita
patients exhibit multiple symptoms resembling aspects of premature
aging, including bone marrow failure and pulmonary
fibrosis (Armanios, 2009), reduced fertility (Bessler et al., 2010),
and several types of cancers (Alter et al., 2009). Because of their
long telomeres, TERT-deficient laboratory mice have to be bred
for four to six generations for disease phenotypes to manifest
(Lee et al., 1998) and are therefore not ideal to model human
TERT deficiency or telomere attrition during aging.

We first asked whether telomerase components are conserved
in the turquoise killifish (Figure 2A). The TERT gene model
(Figure 2A) allowed us to predict a putative TERT protein sequence
in the turquoise killifish. The predicted TERT protein
was conserved, particularly in the RNA binding and the reverse
transcriptase (RT) catalytic domains (Figure 2B). The sequence
divergence between TERT from the turquoise killifish and other 
species precisely matched the evolutionary tree (Figure 2C), 
confirming that the predicted TERT protein indeed corresponds 
to turquoise killifish TERT. Interestingly, our RNA-seq data re- 
vealed that TERT mRNA expression was enriched in the testis 
relative to other tissues in the turquoise killifish, similar to what 
is observed in humans (Bessler et al., 2010) (Figure 2D). TERC, 
the RNA component of telomerase, as well as other genes en- 
coding proteins associated with telomerase (e.g., DYSKERIN) 
or involved in the protection of telomeres (e.g., TRF2 from 
the Shelterin complex), were also present and expressed in the 
turquoise killifish (Figures 2A and S1). Thus, telomerase compo- 
nents are well conserved between human and the turquoise 
killifish. 

To edit the TERT gene in the turquoise killifish, we designed
two gRNAs with a targeting region within TERT exon 2, a
long exon located upstream of both catalytic domains of TERT
(the RNA binding and the RT domains) (Figure 2F, top). One of
the two gRNAs led to the generation of a range of deletions in
the targeted region of the TERT gene (from 3 bp to 15 bp), with
a frequency of 10% (Figure 2E). By raising the injected embryos
to sexual maturity (~40 days), we obtained four F0 chimeras.
Crossing each of these four chimeras with wild-type fish allowed
us to generate stable lines with two different types of deletion in
TERT: 3 bp (Δ3) and 8 bp (Δ8) (Figure 2E). The Δ8 TERT allele was
successfully transcribed, as assessed by PCR amplification and
sequencing of cDNA from heterozygous fish (Figure 2F, bottom).
The Δ8 TERT allele is predicted to give rise to a premature stop
codon in the TERT protein, N-terminal to the catalytic domains
(Figure 2F). These results demonstrate the feasibility of rapid
genome manipulation in the turquoise killifish, with a total time
from injection to stable line of about 2 months.
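
Why an 8 bp deletion is expected to truncate the protein while a 3 bp deletion is not can be shown with a short sketch. A deletion whose length is not a multiple of three shifts the reading frame, and the shifted frame typically reaches a stop codon soon afterward. The coding sequence below is hypothetical and was chosen only so that the effect is visible; it is not the killifish TERT exon 2 sequence.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(cds):
    """Translate from the first base until a stop codon or the sequence end."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def delete(cds, start, length):
    """Remove `length` bases beginning at 0-based position `start`."""
    return cds[:start] + cds[start + length:]


# Hypothetical 36-nt open reading frame (illustration only).
wild_type = "ATGGCCAAGCTGACCGATAACCGTGGCTTCGAGTAA"

print(translate(wild_type))                 # full-length product
print(translate(delete(wild_type, 6, 3)))   # 3 bp deletion: in frame, one residue lost
print(translate(delete(wild_type, 6, 8)))   # 8 bp deletion: frameshift, early stop
```

In this toy sequence the 3 bp deletion removes a single residue, whereas the 8 bp deletion changes the downstream codons and terminates translation after three residues.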

A TERT-Deficient Line in the Turquoise Killifish Exhibits
Loss of Telomerase Function and Is Outwardly Normal

We characterized the fish harboring the Δ8 TERT allele, which is
predicted to result in a TERT protein without catalytic activity. To
reduce the frequency of potential off-target mutations, we backcrossed
TERTΔ8/+ fish for three generations. We then crossed
these heterozygous fish with each other to generate TERTΔ8/Δ8
homozygous individuals (generation 1 of homozygous individuals,
G1) (Figure 3A). The ratio of adult G1 TERTΔ8/Δ8 mutants
followed the expected Mendelian ratio (p = 0.8809, χ² test)
(Figure 3A), indicating no embryonic or juvenile (fry) lethality. Furthermore,
G1 TERTΔ8/Δ8 embryos and adult fish were outwardly
normal (Figures 3B and S2A). We asked whether the TERTΔ8/Δ8
allele was a true loss of function using the Telomere Repeat
Amplification Protocol (TRAP) (Figure 3C). In this assay, tissue
extracts are incubated with a radiolabeled oligonucleotide template,
followed by PCR amplification of elongated products
and autoradiography (Figure 3C). This protocol allowed us to
assess telomerase enzymatic activity in liver extracts from
TERT+/+ (wild-type) and TERTΔ8/Δ8 homozygous siblings
(Figure 3D). Whereas liver extracts from wild-type fish showed
robust telomerase activity, we failed to detect any telomerase
activity in extracts from TERTΔ8/Δ8 fish (Figure 3D). Thus, the
Δ8 allele of TERT, which is predicted to generate a truncated
TERT protein, leads to a complete loss of telomerase activity
and outwardly normal individuals.
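
The Mendelian ratio check can be sketched as a goodness-of-fit calculation. The sketch below assumes a 1:2:1 expectation for wild-type, heterozygous, and homozygous mutant offspring of a heterozygous intercross; the genotype counts are hypothetical placeholders, not the counts underlying Figure 3A. For a chi-square statistic with 2 degrees of freedom, the p value has the closed form exp(-x/2), so no statistics library is needed.

```python
import math


def mendelian_chi_square(observed, expected_ratio=(1, 2, 1)):
    """Chi-square goodness-of-fit for three genotype classes.

    Returns (statistic, p_value). The p value uses the closed form
    exp(-x / 2), which is valid only for 2 degrees of freedom, that is,
    for exactly three classes.
    """
    if len(observed) != 3 or len(expected_ratio) != 3:
        raise ValueError("this sketch handles exactly three genotype classes")
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    expected = [total * r / ratio_sum for r in expected_ratio]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, math.exp(-statistic / 2)


# Hypothetical counts of +/+, Δ8/+, and Δ8/Δ8 adults (illustration only).
stat, p = mendelian_chi_square((26, 48, 26))
print(f"chi-square = {stat:.2f}, p = {p:.2f}")
```

A large p value, as with these placeholder counts, indicates no detectable deviation from the expected ratio, which is the reasoning behind the conclusion that homozygous mutants are not lost before adulthood.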

TERT-Deficient Fish Have Age-Dependent Defects 
in the Germline 

Most human patients with haploinsufficiency for telomerase 
develop normally but exhibit a broad spectrum of tissue homeo- 
stasis failure (Armanios, 2009), especially in highly proliferative 
tissues such as blood, skin, intestine, and male germline (Bessler 
et al., 2010). TERT is highly expressed in the germline (Bessler 
et al., 2010) and is considered to be particularly important for 
maintaining the “immortality” of the germline (Zucchero and 
Ahmed, 2006). We first tested the fertility of young (2-month- 
old) G1 TERTΔ8/Δ8 males compared to control (heterozygous)
siblings by crossing them to young wild-type females (Figure 3E).
Whereas control heterozygous male fish were able to fertilize the
majority of eggs (81%), G1 TERTΔ8/Δ8 males only fertilized 9% of
eggs, indicating a dramatic reduction in fertility (Figure 3F, p <
0.01, Wilcoxon signed-rank test). Older G1 TERTΔ8/Δ8 males
(4 months old) showed a further decline in fertility (Figure 3F, p <
0.05, Wilcoxon signed-rank test, comparison between age
groups). Consistently, the testes of older G1 TERTΔ8/Δ8 males
were atrophied and had an almost complete loss of germ cells
compared to age-matched wild-type controls (Figure 3G, black
arrowheads). Germ cells were present in younger G1 TERTΔ8/Δ8
males (Figure 3G, inserts), suggesting an age-dependent defect
of the germline. Similarly, G1 TERTΔ8/Δ8 females also had atrophied
ovaries (Figure S2B) and laid fewer eggs than wild-type
controls (average of 7 ± 4 and 74 ± 25 eggs, respectively;
Figure S2B). Thus, G1 TERTΔ8/Δ8 fish show premature defects in
their germline, resulting in infertility.

G1 TERTΔ8/Δ8 fish also displayed defects in other highly proliferative
tissues, including blood (overall decrease in tested blood
cell types; Figure S2C) and intestine (villi atrophy in some gut
regions; Figure S2C). Furthermore, as previously reported in
mouse models for TERT (Artandi and DePinho, 2010; Hao
et al., 2005), TERT-deficient fish exhibited epithelial adenomatous
changes (decreased polarity and increased nuclear/cytoplasmic
ratio) (Figure S2D), which could represent a first step
toward intestinal cancers such as those found in dyskeratosis
congenita patients (Alter et al., 2009). In contrast, G1 TERTΔ8/Δ8
fish did not exhibit significant defects in low-proliferative tissues
such as heart, muscle, liver, and kidney (Figure S2E). As the
TERTΔ8/Δ8 turquoise killifish model exhibits phenotypes in the
first generation (as opposed to several generations in laboratory
mice [Lee et al., 1998]) and within 2 months (as opposed to
6-8 months in zebrafish [Anchelin et al., 2013; Henriques et al.,
2013]), it is currently the fastest system to study telomere attrition
pathologies in vertebrates.

TERT-Deficient Fish Exhibit Signs of 
“Genetic Anticipation” 

To further explore the effect of TERT deficiency on the germline, 
we tested whether the offspring of TERT-deficient fish exhibit 
signs of “genetic anticipation.” Genetic anticipation is a phe- 
nomenon in which symptoms of a genetic disorder are increased 
in severity or become apparent at an earlier age in the next 
generation, mostly due to cumulative damage in the germline.
Dyskeratosis congenita patients show genetic anticipation:
offspring of affected individuals often exhibit earlier onset and
more severe symptoms, as well as shorter telomeres (Savage
and Alter, 2009). To test whether TERTΔ8/Δ8 fish also showed
signs of genetic anticipation, we crossed the G1 TERTΔ8/Δ8 homozygous
fish to generate G2 TERTΔ8/Δ8 embryos (Figure 4A).
Whereas G1 TERTΔ8/Δ8 embryos were similar to wild-type embryos
(see Figure 3B), G2 TERTΔ8/Δ8 embryos showed gross
developmental abnormalities (Figure 4B, right) and all died prior
to hatching (Figure 4C). Thus, the severity of phenotype between
generations increases. To test whether telomere length is indeed
shorter in G2 TERTΔ8/Δ8 embryos compared to wild-type or G1
TERTΔ8/Δ8 embryos, we used Terminal Restriction Fragment
(TRF) Southern blot on genomic DNA isolated from TERT+/+,
G1 TERTΔ8/Δ8, or G2 TERTΔ8/Δ8 individual live embryos using a
radio-labeled telomeric probe. These Southern blots revealed
that the average length of telomeres was shorter in G2 TERTΔ8/Δ8
embryos than in wild-type embryos (~1.5 kb versus ~6 kb,
respectively) (Figure 4D, left) and G1 TERTΔ8/Δ8 embryos
(Figures 4D, right, and S3). The dramatic telomere shortening in
the G2 generation of TERT-deficient fish, coupled with the increase
in severity of phenotype, is consistent with genetic anticipation
and germline defects. Thus, we have successfully
generated a vertebrate model for telomerase deficiency that
rapidly recapitulates several characteristics of the corresponding
human disease. Our results also provide a proof of principle
for the use of genome editing in a naturally short-lived vertebrate
as a powerful way to quickly test the function of a gene involved
in human disease and aging.

Site-Specific Precise Editing: Generating a 
Disease-Causing Nucleotide Mutation in TERT 
and Inserting a Short Sequence in POLG 

A large proportion of human diseases are not caused by deletions
but by single nucleotide mutations that result in amino
acid changes (non-synonymous mutations) (Abecasis et al.,
2012). Therefore, we tested the feasibility of editing a specific
amino acid residue, taking advantage of homology-directed
repair (HDR) instead of the less precise NHEJ (Figure 5A). To
this end, we co-injected Cas9 mRNA, one gRNA, and a single-strand
DNA (ssDNA) template with a mutation at the desired
site to modify the corresponding genomic residue via HDR
(Figure 5A) (Bedell et al., 2012). In human TERT, almost all of the
disease-associated mutations are non-synonymous (Podlevsky
et al., 2008), and many are conserved in the turquoise killifish
(Figure 5B). We selected an evolutionary conserved lysine
(K902 in human TERT) whose mutation to arginine (Parry et al.,
2011) or asparagine (Armanios et al., 2005) gives rise to dyskeratosis
congenita. This lysine residue corresponds to K836 in the
turquoise killifish (Figure 5B). To specifically edit K836, we designed
a single gRNA in the proximity of the region encoding
this amino acid and an ssDNA template containing two point mutations:
one that changes K836 to R and another that prevents
Cas9 from further targeting the edited site (Hsu et al., 2014)
(Figure 5C, top). Direct sequencing indeed revealed nucleotide
changes leading to the K836R mutation in the turquoise killifish
TERT (Figure 5C, bottom).
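
The choice of a point mutation for such a template can be illustrated with a small sketch that lists the single-nucleotide substitutions converting one codon into a codon for a desired amino acid, using the standard genetic code. The function name is hypothetical, and because the codon encoding K836 in the killifish genome is not stated here, both lysine codons (AAA and AAG) are shown. The second, Cas9-blocking mutation in the template is not modeled.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def single_base_changes(codon, target_aa):
    """Return codons one substitution away from `codon` that encode `target_aa`."""
    hits = []
    for pos, base in product(range(3), BASES):
        if base != codon[pos]:
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODON_TABLE[mutant] == target_aa:
                hits.append(mutant)
    return hits


for lysine_codon in ("AAA", "AAG"):
    print(lysine_codon, "-> R:", single_base_changes(lysine_codon, "R"))
    print(lysine_codon, "-> N:", single_base_changes(lysine_codon, "N"))
```

For either lysine codon, a single A-to-G change at the second position yields an arginine codon, which is why one point mutation suffices for a K-to-R substitution.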

We next tested the feasibility of precisely knocking in a short
exogenous sequence using HDR (Figure 5A), this time targeting
another candidate gene, the mitochondrial DNA polymerase γ
(POLG). We designed a gRNA targeting exon 2 of POLG and an
ssDNA template containing short homology arms and an exogenous
NdeI restriction sequence (Figure 5D, top). We chose to
target exon 2 of POLG, as it has a very high targeting efficiency
(90%, Figure S4). Direct sequencing or digestion with NdeI revealed
in-frame knockin of the NdeI restriction site into the
genomic sequence of turquoise killifish POLG (Figure 5D, bottom).
Thus, precise genome editing allowed us to generate a specific
human disease-causing mutation in the turquoise killifish TERT
gene and knock in an exogenous sequence in the POLG gene.

A Toolbox of Turquoise Killifish Mutants Encompassing
the Hallmarks of Aging

We next sought to use our platform to target various candidate
genes within the hallmarks of aging pathways (López-Otín
et al., 2013), including cellular senescence and stem cell exhaustion
(p15INK4B), mitochondrial dysfunction (POLG), deregulated
nutrient sensing (IGF1R, RAPTOR, RPS6KB1, and FOXO3),
epigenetic alterations (ASH2L), genomic instability (SIRT6),
loss of proteostasis (ATG5), and intercellular communication
(IL8 and APOE) (Figures 6A and S4 and Table S1). We targeted
genes whose deficiency is expected to either promote longevity
(IGF1R, RAPTOR, and RPS6KB1) or accelerate signs of aging
(TERT and POLG) (López-Otín et al., 2013). Although some



Figure 3. TERTΔ8/Δ8 Fish Show No Telomerase Activity and Exhibit a Progressive Loss of Fertility in the First Generation

(A) Intercrossing of TERTΔ8/+ heterozygous (het) fish to generate generation 1 (G1) TERTΔ8/Δ8 fish. G1 TERTΔ8/Δ8 fish are observed at the expected Mendelian
ratios (no difference between expected and observed frequencies, p = 0.8809, χ² test).

(B) G1 TERTΔ8/Δ8 embryos (left) and adults (right) are outwardly normal.

(C) Schematic for TRAP. Telomerase enzymatic activity in liver is evaluated by the ability of tissue extract to add telomeric repeats to radio-labeled artificial
telomeres in vitro.

(D) Telomerase enzymatic activity as measured by the TRAP assay in TERT+/+ and G1 TERTΔ8/Δ8 fish liver samples. IC: TRAP internal control product. Representative
of three independent experiments.

(E) Experimental design to assess male fertility. TERTΔ8/+ (control) and G1 TERTΔ8/Δ8 (mutant) males, at two different age groups (2 and 4 months), were mated
with young (2 months) wild-type (WT) females. Fertilized eggs (gray) were counted after 1 week.

(F) Ratio of fertilized eggs per week of egg lay in TERTΔ8/+ (control) and G1 TERTΔ8/Δ8 (mutant). Mean + SD of >70 eggs, generated from 4 to 5 crosses per age
group. Wilcoxon signed-rank test, *p < 0.05 and **p < 0.01. For the comparison between age groups, standardized values to age-matched controls were used.

(G) Histological sections of testis from TERT+/+ (control) and G1 TERTΔ8/Δ8 fish at 4 to 5 months (4 m, full-size image) and 2 months (2 m, insert). Sz, spermatozoa
(mature sperm); St, spermatids. Scale bar, 50 μm. Representative of n > 6 individuals from each genotype (4 to 5 months) and n = 2 individuals from each
genotype (2 months). Presence of germ cells in the testis of control fish (top, white arrowheads). Deficiency of germ cells in the testis of TERT-deficient fish
(bottom, black arrowheads).



Cell 160, 1013-1026, February 26, 2015 ©2015 Elsevier Inc.




[Figure 4 panels: (A) Design to create second-generation (G2) TERT mutant fish. (B) G2 TERTΔ8/Δ8 embryos suffer from gross abnormalities. (C) Ratio of successful hatching per genotype. (D) Telomere length measurement (size markers from 0.5 kb to 10 kb; LC, loading control).]




Figure 4. TERT-Deficient Turquoise Killifish Exhibit Genetic Anticipation

(A) Experimental design. G1 TERTΔ8/Δ8 (left) or TERT+/+ (right) fish were intercrossed to generate generation 2 (G2) TERTΔ8/Δ8 or TERT+/+ fish, respectively. The
development of embryos was assessed until hatching.

(B) Representative images of TERT+/+ and G2 TERTΔ8/Δ8 embryos at an equivalent developmental stage. Scale bar, 300 μm.

(C) Ratio of successful hatching per week of egg lay for the indicated genotypes. Mean + SD of >70 embryos for each parental genotype (TERT+/+ versus G1
TERTΔ8/Δ8).

(D) Telomere length measurement using TRF Southern blot. Left: TERT+/+ and G2 TERTΔ8/Δ8 embryos. Representative of three experiments. LC: loading control
for genomic DNA. Right: G1 TERTΔ8/Δ8 and G2 TERTΔ8/Δ8 embryos. Expanded version is in Figure S3. White asterisk: non-specific probe binding.



genes have already been shown to regulate lifespan in both invertebrates
and vertebrates (IGF1R and RPS6KB1) (Kenyon,
2010), others have not yet been tested in vertebrates (ASH2L
and FOXO3) (Figure 6B). Importantly, some genes do not have
obvious orthologs in yeast or invertebrates (p15INK4B, IL8,
and APOE) (Figure 6B). Finally, several genes have been implicated
in human diseases, including APOE (Alzheimer’s disease
[Rhinn et al., 2013]), TERT (dyskeratosis congenita [Armanios,
2009]), and p15INK4B (cancer [Okamoto et al., 1995]).



For each of these 13 genes, we assembled gene models
and predicted protein sequences, analyzed mRNA expression
patterns in four tissues, and profiled the H3K4me3 epigenetic
landscape to determine TSSs (Figures 6C and S4). We designed
two to five gRNA sequences for each gene (Table
S1), which was sufficient to identify at least one successful
gRNA (Figures 6B, 6C, and S4 and Table S1). The efficiency
of targeting ranged from 0% to 90% depending on the
gRNA (Figures 6 and S4 and Table S1). So far, we have

Figure 5. Precise Generation of Human Disease Mutation in TERT and Insertion of a Short Sequence in POLG

(A) Genome-editing pipeline for specific point mutations and insertions. ssDNA: ssDNA template. NHEJ: non-homologous end joining. HDR: homology-directed 
repair. 

(B) Top: disease-associated variants in hTERT. Conservation of the disease-causing residues between human TERT and turquoise killifish TERT is color coded 
(red, identical; pink, similar in turquoise killifish TERT). Bottom: K902 in human TERT is evolutionary conserved and corresponds to K836 in turquoise killifish 
TERT. 

(C) Top: location of a selected gRNA (red line) in close proximity to K836 in exon 11 of the turquoise killifish TERT and core sequence of the co-injected ssDNA

generated chimeras (F0, adult) for 11 genes (Figure 6B). We
have examined germline transmission for five of them
(IGF1RA, IGF1RB, ATG5, ASH2L, and RPS6KB1) and report
the targeted alleles (Figures 6C and S4). We also have stable
lines for a subset of these alleles (Figures 6B and S4 and
Table S1).



[Figure 6 panels: (A) Targeted genes and pathways with stable lines. (B) Targeted genes and evolutionary conservation (toolkit stage and gene conservation) for IGF1RA, IGF1RB, RAPTOR, RPS6KB1, FOXO3, ATG5, SIRT6, TERT, POLG, ASH2L, CXCL8 (IL8), APOE, and p15INK4B. (C) Selected examples of targeted genes (IGF1RA, ASH2L, APOE, and p15INK4B), each with brain H3K4me3 ChIP-seq and brain RNA-seq tracks, gene models, relative expression in testis, tail, brain, and liver, and the observed alleles; targeting efficiency in eggs is given as 24/40 (60%) for IGF1RA and 5/10 (50%) for ASH2L. Key: gRNA; indels/substitutions; germline transmitted (pooled F1 embryos); stable line; normalized RNA expression.]



The platform and toolbox we have developed (genome,
genomic data sets, gene models, and efficient gRNAs), as well
as the mutant fish lines, will be made available to the community.
To facilitate future design of gRNAs in the turquoise killifish, we 
have uploaded the sequenced genome and gene models into 
CHOPCHOP (Montague et al., 2014), thereby providing easy ac- 
cess for the community. Together, our results highlight the ease 
and versatility of our platform for generating mutants in the tur- 
quoise killifish, which will greatly facilitate high-throughput aging 
studies and disease modeling in vertebrates. 

DISCUSSION 

The Turquoise Killifish: A New Vertebrate Model for 
Systematic Studies on Aging and Longevity 

Here, we developed a platform in a naturally short-lived verte- 
brate, the turquoise killifish, for the systematic exploration of ag- 
ing and age-related diseases. The field of aging will greatly 
benefit from the study of species beyond conventional model 
systems (Bolker, 2012). Many exceptionally long-lived vertebrates,
such as the naked mole rat (~30 years), the Brandt’s
bat (~30 years), capuchin monkey (~50 years), rockfish (~150
years), and the bowhead whale (~200 years) (Tacutu et al.,
2013) have already allowed comparative genomics, proteomics,
and cellular studies (Austad, 2010; Gorbunova et al., 2014). However,
long-lived species are not well suited for genetic manipulation,
longitudinal, or lifespan studies. The turquoise killifish, with
its naturally short lifespan, well-characterized aging traits, low 
costs, and ease of maintenance in the laboratory, is highly suited 
for rapid experimental aging research in vertebrates. Further- 
more, the turquoise killifish is currently the shortest living verte- 
brate with a sequenced genome, which will be valuable for 
comparative studies. 

Fish provide several advantages as laboratory species. They 
are amenable to high-throughput approaches such as genetic 
and drug screens (Schartl, 2014). Fish also display a range of
unique traits. For example, zebrafish, the primary fish model, is 
widely used for developmental processes due to its unique char- 
acteristics (e.g., fast and stereotypic embryonic development). 
Other fish have been used for specific traits, including social be- 
haviors (cichlids [Fernald, 2012]) and adaptive evolution (stickle- 
backs [Jones et al., 2012]). Our genome and genome-editing 
platform in the turquoise killifish should help transition this fish 
to a more widely studied model, providing a unique opportunity 
for high-throughput aging and longitudinal studies. It will be 
important to characterize aging in the mutants we have already
generated, as well as to generate additional ones. Finally, the
genome-to-phenotype platform we present here could serve
as a paradigm for how to rapidly develop a wide range of species
into model organisms.

A Proof-of-Concept Model for Telomerase-Related 
Pathologies in the Turquoise Killifish 

By targeting the TERT gene in the turquoise killifish, we have 
developed the fastest system so far for studying telomerase pa- 
thologies in vertebrates. Similar to what is observed in dyskera- 
tosis congenita patients, TERT-deficient fish exhibit defects in 
highly proliferative tissues (male germline, intestine, and blood) 
in the first generation and as early as 2 months of age. This killi- 
fish TERT model should help untangle the interaction between 
aging and telomerase pathologies, which is largely unknown 
despite the fact that telomere attrition rate is a good predictor 
of accelerated aging in humans (Boonekamp et al., 2013). 
Although TERT-deficient killifish exhibit specific age-dependent 
defects, we have not observed premature death by 4-5 months 
of age. This might indicate that the defects in regeneration of 
specific tissues are not limiting for lifespan under these condi- 
tions, although they may be detrimental under more stressful 
conditions (e.g., injuries or end of life). It will be important to char- 
acterize lifespan, regeneration, and telomere length in this TERT 
model during aging. It will also be interesting to compare the 
phenotypes of this TERT deletion model with models mimicking 
in killifish the TERT mutations found in human patients. 

The killifish model fills a unique niche in the wide range of ex- 
isting models of telomerase deficiency. Cellular models have 
been extremely helpful to understand telomerase biology and 
pathologies (Batista and Artandi, 2013), but they cannot easily 
recapitulate systemic defects or tissue interactions. Inverte- 
brate models, which have provided crucial insights into telome- 
rase function (Raices et al., 2005), lack some of the organs 
affected by telomere pathologies in humans (e.g., bona fide 
blood) (Gomes et al., 2010). The main vertebrate model system, 
the laboratory mouse, has been the most widely used to under- 
stand the role of telomerase in specific pathologies, particularly 
cancer (Artandi and DePinho, 2010). However, in laboratory 
mouse strains, phenotypes are only manifested after several 
generations because of their extremely long telomeres. This 
issue can be solved by using the castaneus strain, which has 
shorter telomeres (Hao et al., 2005), but changing genetic background
is time consuming. Recent studies in zebrafish have
been promising, with TERT-deficient zebrafish demonstrating 
a range of phenotypes, including gastrointestinal atrophy, pre- 
mature infertility, and death (Anchelin et al., 2013; Henriques 
et al., 2013), although it took those fish at least 6-8 months to 
exhibit most phenotypes. While the turquoise killifish TERT 
model is still limited by the number of available tools, it should 



Figure 6. A Toolkit for Vertebrate Aging and Age-Related Disease Research 

(A) Genes that were successfully edited in the nine hallmarks of aging. Genes and pathways for which we chose to generate stable lines are indicated in yellow
with a black outline.

(B) Detailed stages of editing completion in specific genes, color-coded as indicated. Presence of orthologs in different species is indicated in gray.

(C) Selected examples of targeted genes depicting detailed genomic, epigenomic, and expression information (upper box), relative expression in tissues (lower
left box), and types of observed indels and substitutions (lower right box). Germline-transmitted alleles assessed in pooled F1 embryos are in yellow. Stable lines
are in yellow with a black outline. Whenever assessed, the targeting efficiency in eggs was indicated as a percentage. For ASH2L, the Δ6 stable line was
generated by a separate pair of founders and was not part of the efficiency calculation. Example of a sequencing chromatogram showing the substitution in
p15INK4B.

be well suited for rapid exploration of telomere pathologies and
screening for potential treatments that can delay these 
pathologies. 

A Toolkit for Modeling Complex Human Diseases, Traits, 
and Drug Responses 

The advent of personalized medicine and high-throughput hu- 
man genetic studies is providing an overwhelming influx of new 
variants associated with specific human diseases, traits, and re- 
sponses to drugs (pharmacogenetics). However, functional vali- 
dation for most of these genes and variants is lagging behind. 
One way to study these candidates has been to generate induced 
pluripotent stem cells (iPSCs) harboring mutations derived from 
patients or engineered de novo (Hockemeyer et al., 2011). 
Although this approach allows high-throughput studies, it does 
not recapitulate the complex interactions between tissues, 
such as endocrine and paracrine communication, as well as 
complex responses to environment or drugs. The turquoise killi- 
fish model could greatly facilitate in vivo high-throughput studies 
of new candidate genes or alleles, while modeling the integrative 
and non-cell-autonomous interactions that are characteristic of 
aging and pathological conditions. 

Recent genomic studies have revealed that many human 
diseases are caused by deleterious non-synonymous variants 
(Abecasis et al., 2012), and this is also likely the case for aging 
and longevity. For example, most disease-causing mutations in 
human TERT are due to variants leading to a single amino acid 
residue change (Podlevsky et al., 2008). Here, we show the feasi- 
bility of editing specific sequences in turquoise killifish genes. 
Such a directed knockin approach will also be particularly helpful 
for the systematic exploration of variants in human longevity 
candidate genes, such as IGF1R, which is among ~200 pre- 
dicted candidates identified by genetic association studies of 
longevity (Tacutu et al., 2013). This approach could also facilitate 
introduction of epitope tags, loxP sites, or artificial stop codons 
at endogenous genomic loci. 

Overall, our study provides a rapid pipeline for genotype-to-phenotype 
analyses in a new vertebrate model with a compressed timescale of aging. 
It also makes the de novo sequenced genome of the turquoise killifish and 
mutant lines of this fish available as a resource. This comprehensive platform 
opens the possibility of screening for genetic and drug interac- 
tions in an integrative system. In addition, it offers a promising 
venue for high-throughput modeling of aging and complex hu- 
man diseases in vivo. 

EXPERIMENTAL PROCEDURES 

Additional details are provided in the Extended Experimental Procedures. 

Gene Model Prediction, Conservation, and Phylogeny 

Gene models were obtained from two independent sources: (1) a de novo 
whole-genome shotgun assembly (GenBank JNBZ00000000) and (2) a de 
novo transcriptome assembly from four adult fish tissues (brain, liver, testis, 
and tail) using Oases (Schulz et al., 2012) (Sequence Read Archive [SRA] 
SRP041421). For the de novo transcriptome assembly, putative annotations 
were obtained by unidirectional blastx to the Swissprot database. The detailed 
genome assembly and annotation will be reported elsewhere (D.R.V., B.A.B., 
P.P.S., and A.B., unpublished data). 



Strand-Specific RNA-Seq Expression Analysis 

RNA extraction was performed using the Nucleospin kit (Machery-Nagel), followed 
by rRNA removal (Ribozero Magnetic Gold Kit, Epicenter). Double-strand 
cDNA was ligated with barcoded adapters and amplified using Illumina 
PCR primers (P1.0 and 2.0, Illumina) prior to sequencing. Expression data were 
analyzed by mapping RNA-seq reads onto gene models using Tophat2 v2.0.4 
and Cufflinks v2.0.2. 

H3K4me3 ChIP-Seq 

H3K4me3 ChIP-seq experiments were performed according to Benayoun 
et al. (2014) on whole-brain tissue isolated from adult male fish (SRA 
SRP045718). 

CRISPR/Cas9 Target Prediction for Guide RNA Selection 

For each selected gene, we identified conserved regions in the coding 
sequence across multiple vertebrate orthologs using http://genome.ucsc.edu/. 
Conserved regions that were upstream of functional or active protein do- 
mains were selected for targeting. gRNA target sites were identified using ZiFiT 
(http://zifit.partners.org/) (Hwang et al., 2013) or CHOPCHOP (https:// 
chopchop.rc.fas.harvard.edu/) (Montague et al., 2014). 
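
As a rough illustration of the first step that target-prediction tools perform, the sketch below scans a sequence on both strands for 20-nt protospacers followed by an NGG PAM. It is a minimal sketch, not the ZiFiT or CHOPCHOP algorithm: the function names are hypothetical, the 20-nt length and NGG PAM are assumptions based on standard S. pyogenes Cas9 usage, and no off-target scoring, promoter constraints, or conservation filtering is applied.

```python
import re

# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_grna_sites(seq: str, protospacer_len: int = 20):
    """List candidate Cas9 sites as (strand, start, protospacer, pam).

    Assumption: a site is `protospacer_len` bases followed by an NGG PAM.
    `start` is 0-based on the scanned strand, i.e. forward coordinates for
    '+' and reverse-complement coordinates for '-'.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    sites = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            sites.append((strand, match.start(), match.group(1), match.group(2)))
    return sites
```

Candidates from such a scan would still need to be restricted to the conserved regions described above and checked for specificity with a dedicated tool.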

Guide RNA Synthesis 

Initial experiments were performed using the DR274 guide RNA expression 
vector (Addgene, 42250) (Hwang et al., 2013). In subsequent experiments, 
hybridized oligonucleotides were used as an in vitro transcription template. 
gRNAs were in vitro transcribed and purified using the MAXIscript T7 kit (Life 
Technologies). 

Production of Cas9 mRNA 

Initial experiments were performed using the MLM3613 Cas9 expression vector 
(Addgene, 42251) (Hwang et al., 2013). In subsequent experiments, the 
pCS2-nCas9n expression vector was used (Addgene, 47929) (Jao et al., 
2013). Capped and polyadenylated Cas9 mRNA was in vitro transcribed and 
purified using either the mMESSAGE mMACHINE T7 ULTRA or SP6 kits 
(Life Technologies). 

Single-Stranded DNA Template for Homology-Directed Repair 

For homology-directed repair (HDR) experiments, ssDNA templates were de- 
signed to contain short homology arms (30 bp-50 bp) surrounding the gRNA 
target. The ssDNA templates were commercially synthesized and purified prior 
to injection (QIAquick Nucleotide Removal Kit, QIAGEN) (Bedell et al., 2012). 
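
The template layout described above can be sketched as a small helper that joins a left homology arm, the desired edit, and a right homology arm. This is an illustrative sketch only: the function name and parameters are hypothetical, and the source does not specify which strand was synthesized or exactly how the edit was positioned relative to the cut site, so the code simply assumes the edit replaces a given interval of the reference sequence.

```python
def build_hdr_template(ref: str, edit_start: int, edit_end: int,
                       new_bases: str, arm_len: int = 40) -> str:
    """Return left arm + new_bases + right arm for an ssDNA HDR template.

    ref        : reference sequence around the gRNA target (same strand
                 as the desired template; strand choice is an assumption)
    edit_start : 0-based index of the first reference base to replace
    edit_end   : 0-based index one past the last base to replace
    new_bases  : bases to introduce (e.g. a substitution or a short tag)
    arm_len    : homology arm length; 30-50 nt as stated in the text
    """
    if not 30 <= arm_len <= 50:
        raise ValueError("arm_len should be between 30 and 50 nt")
    if not 0 <= edit_start <= edit_end <= len(ref):
        raise ValueError("edit interval lies outside the reference")
    if edit_start < arm_len or len(ref) - edit_end < arm_len:
        raise ValueError("reference too short for the requested arms")
    left_arm = ref[edit_start - arm_len:edit_start]
    right_arm = ref[edit_end:edit_end + arm_len]
    return left_arm + new_bases + right_arm
```

For example, replacing a 3-nt codon with 40-nt arms yields an 83-nt template; in practice the edit is often also chosen to disrupt the PAM or protospacer so that the repaired allele is not re-cut, which this sketch does not check.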

Microinjection of Turquoise Killifish Embryos and Sequencing of 
Targeted Sites 

Microinjection of turquoise killifish embryos was performed according to 
Valenzano et al. (2011). Cas9-encoding mRNA (200-300 ng/μl) and gRNA 
(30 ng/μl) were mixed with phenol-red (2%) and co-injected into one-cell-stage 
fish embryos. For HDR experiments, the ssDNA template (20 μM) was also 
co-injected. Three days after injection, genomic DNA was extracted from five to 
ten pooled embryos. The genomic area encompassing the targeted site 
(~600 bp) was PCR amplified. Endonuclease digestions or DNA sequencing 
was used for analysis (Table S1). 
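
The volumes needed to reach such concentrations follow from C1 x V1 = C2 x V2. The sketch below applies that arithmetic to an injection mix. It assumes the concentrations quoted above are final concentrations in the mix, which the text does not state explicitly. The stock concentrations in the usage note are hypothetical placeholders, not values from the source, and the function name is illustrative.

```python
def injection_mix(total_ul: float, stocks: dict, finals: dict) -> dict:
    """Compute component volumes (in microliters) for an injection mix.

    stocks and finals map component name -> concentration; each component
    must use the same unit in both dicts (e.g. ng/ul, uM, or %).
    The remaining volume is filled with water.
    """
    volumes = {}
    for name, final_conc in finals.items():
        stock_conc = stocks[name]
        if final_conc > stock_conc:
            raise ValueError(f"{name}: stock is more dilute than the target")
        # C1 * V1 = C2 * V2  ->  V1 = C2 * V2 / C1
        volumes[name] = total_ul * final_conc / stock_conc
    water = total_ul - sum(volumes.values())
    if water < -1e-9:
        raise ValueError("component volumes exceed the total volume")
    volumes["water"] = max(water, 0.0)
    return volumes
```

With hypothetical stocks of 1000 ng/μl Cas9 mRNA, 300 ng/μl gRNA, and 10% phenol-red, a 10 μl mix at 250 ng/μl, 30 ng/μl, and 2% would need 2.5 μl, 1.0 μl, and 2.0 μl of the respective stocks plus 4.5 μl of water.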

Fish husbandry, telomerase activity and telomere length measurements, 
fertility, histology, and blood count analyses are provided in the Extended 
Experimental Procedures. 

ACCESSION NUMBERS 

Sequencing and genome data have been deposited in GenBank 
(JNBZ00000000). RNA-seq (SRP041421) and H3K4me3 ChIP-seq 
(SRP045718) data were submitted to the SRA (Sequence Read Archive). 

SUPPLEMENTAL INFORMATION 

Supplemental Information includes Extended Experimental Procedures, four 
figures, and one table and can be found with this article online at 
http://dx.doi.org/10.1016/j.cell.2015.01.038. 